We present a general model for quantum channels with memory, and show that it is sufficiently
general to encompass all causal automata: any quantum process in which outputs up to some time 
t do not depend on inputs at times t' > t can be decomposed into a concatenated memory channel. 
We then examine and present different physical setups in which channels with memory may be 
operated for the transfer of (private) classical and quantum information. These include setups in 
which either the receiver or a malicious third party have control of the initializing memory. We 
introduce classical and quantum channel capacities for these settings, and give several examples to 
show that they may or may not coincide. Entropic upper bounds on the various channel capacities 
are given. For forgetful quantum channels, in which the effect of the initializing memory dies out as 
time increases, coding theorems are presented to show that these bounds may be saturated. Forgetful 
quantum channels are shown to be open and dense in the set of quantum memory channels. 
I. INTRODUCTION



Any processing of quantum information, be it storage 
or transfer, can be represented as a quantum channel: 
a completely positive and trace-preserving map S that 
transforms states (density matrices) on the sender's end 
of the channel into states on the receiver's end. Until 
now most of the work on quantum channels has con- 
centrated on memoryless channels, which are character- 
ized by the requirement that successive channel inputs 
are acted on independently. Mathematically, this means 
that messages of n symbols are processed by the tensor 
product channel $S^{\otimes n}$.

However, in many real-world applications the assump- 
tion of having uncorrelated noise channels cannot be jus- 
tified, and memory effects need to be taken into account. 
It thus seems desirable to extend the theory of quan- 
tum channels to encompass memory effects, and to create 
a common framework in which experiments with both 
correlated and uncorrelated noise can be naturally de- 
scribed. In fact, such a framework is already necessary 
for estimates on almost memoryless channels, for instance 
when assessing whether a particular system can arguably 
be modelled as a memoryless channel. In the present pa- 
per such a unified framework will be presented, and it 
will be shown how this model can be applied to the de- 
scription of different information processing tasks, such 
as (private) classical and quantum information transfer. 
FIG. 1: Left: A quantum memory channel with input register
A, output register B, and memory system M. Right: A
threefold concatenation $S_3$ of memory channels, with time
running from left to right, and coded information running
from bottom to top.



FIG. 2: By the Structure Theorem, a causal automaton T can 
be decomposed into a chain of concatenated memory channels 
S plus some input initializer R. Evaluation with the identity 
operator 1 means that the corresponding output is ignored. 



A. Outline and Overview 

In our contribution we present a general model for 
quantum channels with memory. In addition to Alice's 
input register A and Bob's output register B, such a 
channel has an additional memory input and an additional
memory output, denoted by M (cf. Fig. 1, left).
Long messages with n signal states will then be processed
by successive application of these memory channels, resulting
in the concatenated channel $S_n$ depicted in Fig. 1
(right). This picture will be turned into a rigorous definition
in Section III A, after the mathematical framework
has been introduced in Section II.

In such a setup, the memory system is passed on 
from one application of the channel to the next, and 
introduces (quantum or classical) correlations between 
consecutive signal states. If no memory system is 
present, the concatenated channel will simply be a 
product channel, bringing us back to the memoryless 
realm in which consecutive signal states are acted on 
independently. 

This model marks a constructive approach to quantum 
channels with memory. It is certainly the appropriate 
framework when the physical realization of the memory 
M is known. However, in many applications of information
theory only the input-output behavior of a channel is
of interest. From this point of view the memory would be 
part of the internal workings of the channel, and would 
not be made part of the description. We call this way 
of describing channels the axiomatic approach: It takes a 
channel as a transformation turning infinite strings of in- 
put systems to infinite strings of outputs, with only two 
basic assumptions: translation invariance and the condition
of causality, i.e., outputs up to some time t do not
depend on inputs at times t' > t. In the classical theory,
such channels are sometimes called non-anticipatory. It
is clear from Fig. 1 that a channel with memory automatically
satisfies this causality condition.

Taking a causal channel and representing it as a chan- 
nel with memory amounts to reconstructing a model of 
the channel and its internal memory states and dynamics.
This is a highly non-trivial task, even in the classical
case. However, a formal reconstruction can always be
given. This is what we call the Structure Theorem for
causal channels, and is illustrated in Fig. 2. A rigorous
version will be given as a theorem in Section IV. In general,
it produces not only the channel step operator S, but 
also a map R defining the influence of input states in the 
remote past on the memory. Intuitively, however, such 
a map is often not needed, because memory effects de- 
crease in time. A similar condition is needed for passing 
from the constructive approach of channels with memory 
to causal input-output channels: Since the constructive 
approach allows one to choose the initial memory state, 
output states in general depend on this choice, and in 
general this influence will depend on the time after ini- 
tialization. So in order to get a time translation invariant 
channel without such dependence, the channel S must 
lose the initialization information. We call S forgetful 
if outputs at a large time t depend only weakly on the 
memory initialization at time zero, in a sense made precise
in Section V. For forgetful channels, memory effects
will be shown to decrease even exponentially.

Not every channel is forgetful. The prime counterexample
is a channel with a global classical switch, discussed
in Section III C. The memory in this case is a classical
bit, left unchanged by S, but determining which of two
memoryless channels $S_0$, $S_1$ is applied to the input at
each time. However, we will show in Section V that
generic memory channels are in fact forgetful, in the 
sense that every non-forgetful quantum channel can be 
approximated by a forgetful channel to arbitrary degree 
of accuracy. In addition, for every forgetful quantum 
channel we may find a finite-size neighborhood in which 
all channels are likewise forgetful. In mathematical 
terms, forgetful quantum channels are both open and 
dense in the set of quantum memory channels. 

For quantum channels with memory, capacity can be 
defined along the lines familiar from the memoryless set- 
ting 0,01, both for the transmission of classical and quan- 
tum information. Channel capacity expresses quantita- 
tively how well a given channel S can simulate a noiseless 
qubit (or bit) channel: roughly speaking, it is the maxi- 
mal number of ideal qubit (resp. bit) transmissions per 
use of the channel, taken in the limit of long messages 
FIG. 3: Two signal states are encoded into three input registers,
sent through the concatenated memory channel, and
then decoded into two output states. If the overall channel
is (in some sense to be specified in Section III B) close to
the ideal channel on two inputs, the transmission rate of the
above scheme is 2/3. Capacity is the largest such rate, in the
limit of long messages and optimal encoding and decoding. In 
the above setup, the initial memory input can be thought of 
as being controlled by either the sender or a malicious third 
party. Similarly, the receiver may or may not be able to read 
out the final memory state. 



and using encoding and decoding schemes asymptotically 
eliminating all errors. The concept is illustrated in Fig. 3.
However, when trying to send information through 
a concatenated memory channel, unlike in the memo- 
ryless case we also have to specify how to handle the 
initial and final memory state. In particular, we may 
distinguish between setups in which Alice can access 
the initial memory input state and may use it for the 
encoding procedure, and setups in which a malicious 
third party (Eve, say) controls the initial memory 
input, and by her choice of the input state will try to 
prevent Alice and Bob from communicating over the 
channel. Likewise, we may consider setups in which
either Bob or Eve controls the final memory output.
These distinctions will be made precise in Section III B.
They lead to slight variations in the notion of capacity,
and in Section III C we will present several examples
to show that the resulting capacities may or may not
coincide. In particular, for channels with only one
Kraus operator, all these capacities are the same, and
equal the capacity of the ideal channel (cf. Section III D).

The various capacities can be bounded from above 
both in terms of the capacity of memoryless channels 
and in terms of entropic expressions. Some of these 
bounds will be presented. In particular, the standard 
mutual information and coherent information bounds 
familiar from the memoryless setting easily extend to 
memory channels (cf. Section VI A).

Forgetful channels are, in a sense to be specified in
Section V, close to memoryless channels. As such, they
play a central role not only as the bridge between the 
axiomatic and the constructive approach to quantum 
FIG. 4: An unmodulated spin chain as a quantum channel
with memory: Alice places the input signal on the first spin 
of the chain and lets it propagate to Bob, who controls the 
spin at the opposite end of the chain. 




FIG. 5: In a micromaser, a stream of two-level atoms is 
injected into a high-quality superconducting cavity. The field 
modes introduce correlations between consecutive atoms. 



memory channels and as generic examples of quantum
memory channels, but also connect them to the
memoryless realm. In Section VI B we will explain
how the standard random coding techniques familiar 
from the memoryless setting can be modified to saturate 
the entropic upper bounds on the channel capacity 
for forgetful channels, leading to coding theorems for 
(private) classical and quantum information transfer for 
this very important class of memory channels. 

We conclude with a Summary and Outlook. An Ap- 
pendix contains some mathematical background relevant 
to the description of infinite-dimensional quantum sys- 
tems, insofar as it is essential to the understanding of 
the Structure Theorem. 



B. Model Systems and Related Work 

Quantum channels which naturally acquire a memory 
are abundant in all branches of quantum information pro- 
cessing: 

Recently, an unmodulated spin chain has been proposed
as a model for short distance quantum communication
[H 0, |H El. In such a scheme, the state to be communicated
over the channel is placed on one of the spins
of the chain, propagates for a specific amount of time,
and is then received at a distant spin of the chain (cf.
Fig. 4). When viewed as a model for quantum communication,
it is generally assumed that a reset of the spin
chain occurs after each signal \J\, for example by applying
an external magnetic field, resulting in a memoryless
channel. However, a continuous operation without reset 
may lead to higher transmission rates, and corresponds 
to a quantum channel with memory. 

Another model of a quantum channel with memory is 
the so-called one-atom maser or micromaser. In
such a device, excited atoms interact with the photon
field inside a high-quality optical cavity, as depicted in
Fig. 5. If the photons inside the cavity have sufficiently
long lifetime, atoms entering the cavity will feel the 
effect of the preceding atoms, introducing correlations 
between consecutive signal states. 

Apparently, the first model of a quantum channel with 
memory was introduced by Macchiavello et al. in 2001: 
they gave an example of a qubit channel with Markovian
correlated noise [10], in which entangled input
states may increase the transmission rate for classical information.
These results have recently been extended to
some bosonic Gaussian channels 0, U|. Such an effect
has been demonstrated experimentally for optical fiber
channels with fluctuating birefringence, in which consecutive
light pulses undergo strongly correlated polarization
transformations [14, 15]. (Whether such examples
exist in the memoryless setting is still an open question,
and presently considered one of the most eminent open
problems of quantum information theory, with wide implications
for other problems in the field [16, 17].)

Subsequently, the study of quantum channels with 
memory has largely been confined to channels with 
Markovian correlated noise (cf. 0, 0] and references 
therein). A Lindbladian approach to memory channels 
has been taken by Daffer et al. [13, 0|- Upper bounds 
on the classical capacity for a more general class of chan- 
nels have been given recently by Bowen et al. p^. 

All the memory channels discussed in this Section are 
causal quantum channels, and thus the Structure Theorem
applies. A completely different approach has been
taken by Hayashi and Nagaoka [2j|, who refrain from imposing
any structural assumption on the quantum channels
they consider, and apply the information-spectrum
method to obtain a coding theorem for the classical product
state capacity, following work by Verdú and Han [24]
on classical channels with memory.

We refer to Verdú's overview paper [25] and the Gray-Davisson
collection [26] for more information on memory
channels in the purely classical setting.



II. LANGUAGE AND NOTATIONS 
A. States, Channels, and Observables 

According to the rules of quantum mechanics, every 
quantum system is associated with a Hilbert space $\mathcal{H}$,
which for the purpose of this paper can mostly (but not
always, see the discussion in Section II B) be taken as
finite dimensional. The observables of the system are
given by bounded linear operators on the Hilbert space
$\mathcal{H}$, written $\mathcal{B}(\mathcal{H})$. The physical states associated with the
system are density operators $\varrho \in \mathcal{B}_*(\mathcal{H})$, where the latter
denotes the space of trace class operators on $\mathcal{H}$.

A quantum channel S which transforms input systems
described by a Hilbert space $\mathcal{H}_1$ into output systems described
by a (possibly different) Hilbert space $\mathcal{H}_2$ is represented
mathematically by a completely positive unital
map $S\colon \mathcal{B}(\mathcal{H}_2) \to \mathcal{B}(\mathcal{H}_1)$. By unitality we mean that
$S(\mathbb{1}_{\mathcal{H}_2}) = \mathbb{1}_{\mathcal{H}_1}$, with the identity operator $\mathbb{1}_{\mathcal{H}_i} \in \mathcal{B}(\mathcal{H}_i)$.
Each channel S can be written in the so-called Kraus
form [23]

$$S(X) = \sum_{i=1}^{n} s_i^* X s_i \qquad (1)$$

with a number of $n \le \dim(\mathcal{H}_1)\dim(\mathcal{H}_2)$ Kraus operators
$s_i\colon \mathcal{H}_1 \to \mathcal{H}_2$.

The physical interpretation of the quantum channel S 
is the following: when the system is initially in the state 
$\varrho \in \mathcal{B}_*(\mathcal{H}_1)$, the expectation value of the measurement
of the observable $X \in \mathcal{B}(\mathcal{H}_2)$ at the output side of the
channel is given in terms of S by $\mathrm{tr}(\varrho\, S(X))$.

Alternatively, and perhaps more intuitively, we can 
look at the dynamics of the states and introduce the dual 
map $S_*\colon \mathcal{B}_*(\mathcal{H}_1) \to \mathcal{B}_*(\mathcal{H}_2)$ by means of the duality relation

$$\mathrm{tr}(S_*(\varrho)\, X) = \mathrm{tr}(\varrho\, S(X)). \qquad (2)$$

$S_*$ is a completely positive and trace-preserving map
and represents the channel in Schrödinger picture, while
S provides the Heisenberg picture representation (cf.
Davies' textbook [28] and Keyl's survey article for
a more extensive discussion of observables, states, and 
channels). 
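The duality relation Eq. (2) can be checked numerically for any finite-dimensional channel given in Kraus form. The following is a minimal sketch, not part of the original text: the amplitude-damping Kraus operators, the state `rho`, and the observable `X` are illustrative assumptions chosen only to exercise the two pictures.

```python
import numpy as np

def heisenberg(kraus, X):
    """Heisenberg picture, Eq. (1): S(X) = sum_i s_i^* X s_i."""
    return sum(s.conj().T @ X @ s for s in kraus)

def schroedinger(kraus, rho):
    """Schroedinger picture: S_*(rho) = sum_i s_i rho s_i^*."""
    return sum(s @ rho @ s.conj().T for s in kraus)

# Illustrative assumption: qubit amplitude damping with damping parameter p.
p = 0.3
kraus = [
    np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex),
    np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex),
]

rho = np.array([[0.25, 0.1j], [-0.1j, 0.75]], dtype=complex)  # density matrix
X = np.array([[1, 2 - 1j], [2 + 1j, -3]], dtype=complex)      # observable

lhs = np.trace(schroedinger(kraus, rho) @ X)   # tr(S_*(rho) X)
rhs = np.trace(rho @ heisenberg(kraus, X))     # tr(rho S(X))

# S is unital exactly when S_* is trace preserving.
unital = np.allclose(heisenberg(kraus, np.eye(2)), np.eye(2))
trace_preserving = np.isclose(np.trace(schroedinger(kraus, rho)), 1.0)
```

Both sides agree because the trace is cyclic, which is all that Eq. (2) uses in finite dimensions.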

B. Heisenberg vs. Schrödinger

For the finite dimensional systems we will consider in
Section III, Schrödinger picture and Heisenberg picture
are completely equivalent descriptions of quantum processes
by means of the duality relation Eq. (2). However,
in the axiomatic characterization of quantum channels, as
presented in Section IV, we will have to deal with infinite-dimensional
systems, for which Heisenberg picture is the
mandatory language. Thus, for consistency we work
in Heisenberg picture throughout, emphasizing that for
finite-dimensional systems conversion to Schrödinger picture
is always immediate from Eq. (2). Some mathematical
background on the description of infinite-dimensional
systems, insofar as it is essential to the understanding of 
the present paper, is relegated to the Appendix. Most no- 
tably, this includes quasi-local algebras and Stinespring's 
dilation theorem. 



C. Distance between Quantum Channels 

From the informal discussion in Section I A it is clear
that the definition of channel capacity requires the comparison
of the quantum channel after the encoding and
decoding process with an ideal channel. As a measure of
the distance between two channels we favor the norm of
complete boundedness (or cb-norm, for short) [29], denoted
by $\|\cdot\|_{cb}$. For two channels T and S, the distance
$\|T - S\|_{cb}$ can be defined as the largest difference
between the overall probabilities in two statistical
quantum experiments differing only by exchanging one
use of S by one use of T. These experiments may involve
entangling the systems on which the channels act
with arbitrary further systems. Equivalently, we may set
$\|T\|_{cb} = \sup_n \|T \otimes \mathrm{id}_n\|_\infty$, where $\|\cdot\|_\infty$ denotes the norm
of linear operators between the Banach spaces $\mathcal{B}(\mathcal{H}_i)$ (cf.
Appendix), and $\mathrm{id}_n$ denotes the identity map (ideal channel)
on the n x n matrices.

Among the properties which make the cb-norm well-suited
for capacity estimates are norm multiplicativity,
$\|T_1 \otimes T_2\|_{cb} = \|T_1\|_{cb}\,\|T_2\|_{cb}$, and unitality, $\|T\|_{cb} = 1$
for any channel T. The equivalence with other error criteria
such as minimum fidelity and entanglement fidelity
is discussed extensively in the literature.

When working in the Schrödinger picture representation,
the so-called trace norm $\|\varrho\|_1 := \mathrm{tr}\sqrt{\varrho^*\varrho}$ is
frequently used to evaluate the distance between two
quantum states. Again we refer to the literature for the equivalence
with other distance measures.

Note that throughout this work we use base two logarithms,
and we write $\mathrm{ld}\, x := \log_2 x$.



III. CHANNELS WITH MEMORY 

A. The Constructive Approach 

A relatively simple (yet surprisingly general, see below) 
model to describe channels with correlated noise consists 
of a quantum channel which, in addition to Alice's input
register system $\mathcal{H}_A$ and Bob's output register system
$\mathcal{H}_B$, has an additional memory input $\mathcal{H}_M$ and an additional
memory output $\mathcal{H}_{M'}$. (Since the smaller of the
two Hilbert spaces $\mathcal{H}_M$, $\mathcal{H}_{M'}$ can always be thought of as
being embedded in the larger one, in the following we will
assume without loss of generality that $\mathcal{H}_M = \mathcal{H}_{M'}$.) Mathematically,
a quantum channel with memory (or, for short, memory
channel) is represented (in Heisenberg picture) as a completely
positive and unital map
$S\colon \mathcal{B}(\mathcal{H}_B) \otimes \mathcal{B}(\mathcal{H}_M) \to \mathcal{B}(\mathcal{H}_M) \otimes \mathcal{B}(\mathcal{H}_A)$.
Often we will abbreviate $\mathcal{B}(\mathcal{H}_A)$ to
$\mathcal{A}$, and similarly for $\mathcal{B}(\mathcal{H}_B)$ and $\mathcal{B}(\mathcal{H}_M)$. Long messages
with $n \in \mathbb{N}$ signal states will then be processed by successive
application of memory channels, resulting in the
concatenated channel $S_n\colon \mathcal{B}^{\otimes n} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}^{\otimes n}$ given
as follows (see Fig. 1):

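$$S_n = (S \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n-1}) \circ \dots \circ (\mathrm{id}_{\mathcal{B}}^{\otimes n-2} \otimes S \otimes \mathrm{id}_{\mathcal{A}}) \circ (\mathrm{id}_{\mathcal{B}}^{\otimes n-1} \otimes S), \qquad (3)$$

where id denotes the identity operation (ideal or noiseless
channel): $\mathrm{id}(X) = X \;\forall X$.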
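For finite dimensions the concatenation can be sketched in the Schrödinger picture, where each use of the channel consumes the memory and one input register and produces one output register and the new memory. The code below is a hedged illustration only: it assumes Kraus operators $s_i\colon \mathcal{H}_M \otimes \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_M$, and the helper names `kron`, `swap`, `concatenate`, and the bit-flip example are not taken from the text. As a sanity check it confirms that a channel which ignores the memory reduces to a product channel.

```python
import numpy as np
from functools import reduce

def kron(*ops):
    """Tensor product of several matrices, left to right."""
    return reduce(np.kron, ops)

def swap(d1, d2):
    """Unitary sending |i>|j> in C^d1 (x) C^d2 to |j>|i> in C^d2 (x) C^d1."""
    U = np.zeros((d1 * d2, d1 * d2))
    for i in range(d1):
        for j in range(d2):
            U[j * d1 + i, i * d2 + j] = 1.0
    return U

def concatenate(kraus, state, n, dA, dB):
    """Apply n uses of a memory channel to a state on M (x) A_1 (x) ... (x) A_n.

    kraus: operators s_i mapping H_M (x) H_A -> H_B (x) H_M.
    Returns the state on B_1 (x) ... (x) B_n (x) M.
    """
    for k in range(1, n + 1):
        left = np.eye(dB ** (k - 1))    # outputs already produced
        right = np.eye(dA ** (n - k))   # inputs not yet processed
        full = [kron(left, s, right) for s in kraus]
        state = sum(f @ state @ f.conj().T for f in full)
    return state

# Illustrative memoryless case: a qubit bit flip with probability q that
# passes the memory through untouched.
q = 0.2
X = np.array([[0, 1], [1, 0]], dtype=complex)
ts = [np.sqrt(1 - q) * np.eye(2), np.sqrt(q) * X]
kraus = [swap(2, 2) @ np.kron(np.eye(2), t) for t in ts]

def T(rho):
    """The memoryless channel acting on a single input state."""
    return sum(t @ rho @ t.conj().T for t in ts)

mu = np.diag([0.7, 0.3]).astype(complex)                 # initial memory
rho1 = np.array([[1, 0], [0, 0]], dtype=complex)         # first signal
rho2 = np.array([[0.5, 0.5j], [-0.5j, 0.5]], dtype=complex)  # second signal

out = concatenate(kraus, kron(mu, rho1, rho2), n=2, dA=2, dB=2)
expected = kron(T(rho1), T(rho2), mu)   # product channel, memory unchanged
```

The ordering of tensor factors (outputs on the left, memory in the middle, unprocessed inputs on the right) is a bookkeeping choice of this sketch.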

The Schrödinger picture equivalent of this model was
introduced by Bowen and Mancini in [19] and has been
shown to encompass channels with Markovian correlated
noise discussed previously in 0, 0, [2(j. As advertised
in the Introduction, in Section IV we will show that
this model is sufficiently general to describe all causal
quantum channels, which was left as an open problem in
[19]. However, before we prove the Structure Theorem
we will extend the notion of channel capacity from the 
memoryless setting to channels with memory, and we will 
present several different setups in which these channels 
may be operated for the transmission of both classical 
and quantum information. 

B. Channel Capacity 

As explained in Section I A, the standard definition of
capacity applies also to quantum channels with memory.
However, as illustrated in Fig. 3, we have to specify how to
handle the initial and final memory states. In particular,
we need to distinguish between setups in which Alice has
control over the initial memory input state and may use
it for the encoding procedure, and setups in which a malicious
third party (Eve, say) controls the initial memory
input, and by her choice of the input state $\mu \in \mathcal{B}_*(\mathcal{H}_M)$
will try to prevent Alice and Bob from communicating
over the channel. Likewise, we may consider setups in
which the final memory state is either ignored or accessible
to Bob, and can thus be employed in the decoding
process.

In the definition of channel capacity presented below, 
these four different scenarios are distinguished by a dif- 
ferent range and domain of the encoding and decoding 
map, respectively, and give rise to four different chan- 
nel capacities for both classical and quantum information 
transmission. 

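Definition 1 Let $\mathcal{H}_A$, $\mathcal{H}_B$, and $\mathcal{H}_M$ be Hilbert spaces.
A positive number R is called an achievable rate for
the quantum memory channel
$S\colon \mathcal{B}(\mathcal{H}_B) \otimes \mathcal{B}(\mathcal{H}_M) \to \mathcal{B}(\mathcal{H}_M) \otimes \mathcal{B}(\mathcal{H}_A)$
iff for any pair of integer sequences
$(n_\nu)_{\nu\in\mathbb{N}}$ and $(m_\nu)_{\nu\in\mathbb{N}}$ with $\lim_{\nu\to\infty} n_\nu = \infty$ and
$\lim_{\nu\to\infty} \frac{m_\nu}{n_\nu} < R$ we have

$$\lim_{\nu\to\infty} \Delta(n_\nu, m_\nu) = 0, \qquad (4)$$

where we set

$$\Delta(n_\nu, m_\nu) := \inf_{E,D} \left\| E\, S_{n_\nu} D - \mathrm{id}_2^{\otimes m_\nu} \right\|_{cb}, \qquad (5)$$

the infimum taken over all encoding channels E and decoding
channels D with suitable domain and range.
The quantum channel capacity Q(S) of the memory
channel S is defined to be the supremum of all achievable
rates.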

In the different setups described above, the domain of the
encoding channels E may or may not include the initial
memory algebra $\mathcal{B}(\mathcal{H}_M)$, and the range of the decoding
channels D may or may not contain the final memory algebra
$\mathcal{B}(\mathcal{H}_M)$, resulting in four different quantum capacities
$Q_{AB}(S)$, $Q_{AE}(S)$, $Q_{EB,\mu}(S)$, and $Q_{EE,\mu}(S)$, where
the first index stands for the party (Alice, Bob, or Eve)
who controls the initial memory state, the second index
stands for the party who has access to the final memory
state, and $\mu \in \mathcal{B}_*(\mathcal{H}_M)$ stands for Eve's choice of the
initial memory state, if applicable.

Remark 1 The capacity of a quantum memory channel
S for the transmission of classical information can be defined
along the same lines, restricting encoding channels
to preparations and decoding channels to measurements
[30], and replacing the ideal qubit channel $\mathrm{id}_2$ by the
ideal bit channel in Eq. (5). The respective capacities are
denoted by $C_{AB}(S)$, $C_{AE}(S)$, $C_{EB,\mu}(S)$, and $C_{EE,\mu}(S)$,
and are no smaller than their quantum counterparts.

Remark 2 In the sections to follow, we will write $Q_*(S)$
and $C_*(S)$ whenever a certain statement holds for all the
four channel capacities introduced in Def. 1, regardless
of Eve's choice of the initial memory state.

Remark 3 It is obvious from the definition that for every
memory channel S the capacities introduced in Def. 1
satisfy the following chain of inequalities:

$$Q_{EE,\mu}(S) \le \{Q_{AE}(S),\, Q_{EB,\mu}(S)\} \le Q_{AB}(S) \qquad (6)$$

for all $\mu \in \mathcal{B}_*(\mathcal{H}_M)$, and accordingly for the classical
capacities $C_{EE,\mu}(S)$ etc.

Remark 4 Note that there are several equivalent definitions
of channel capacity. In particular, it is sufficient to
find one pair of integer sequences $(n_\nu)_{\nu\in\mathbb{N}}$ and $(m_\nu)_{\nu\in\mathbb{N}}$
such that $\lim_{\nu\to\infty} \frac{m_\nu}{n_\nu} = R$ and $\lim_{\nu\to\infty} \Delta(n_\nu, m_\nu) = 0$,
provided the diverging sequence $(n_\nu)_{\nu\in\mathbb{N}}$ is subexponential,
i.e., $\lim_{\nu\to\infty} \frac{n_{\nu+1}}{n_\nu} = 1$.

In addition, the cb-norm in Eq. (5) can be replaced by
other distance measures such as minimum fidelity or entanglement
fidelity. See [1] for a detailed discussion of
these matters.



C. Examples 

In the following, in order to illustrate the concepts in- 
troduced above we will present several examples of quan- 
tum memory channels. These examples will also serve to 
show that the different capacities introduced in Def. 1
may or may not coincide, thereby justifying our defining 
more than one capacity. 

A simple model channel for which all the capacities introduced
above coincide is the Shift Channel $S_s$. In principle,
this is just a noiseless channel, but it interchanges
memory and input register: $S_s(b \otimes m) := b \otimes m$. (Note
that in the tensor representation that we have chosen,
the identity channel id comes with the inherent flip, i.e.,
$\mathrm{id}(b \otimes m) = m \otimes b$.) Thus, in an n-fold concatenation of
Shift Channels, the signals that Alice sends through the
channel will be received by Bob undistorted one time-step
later. In the capacity limit of long messages, as $n \to \infty$,
the initial qubit that Bob may lose if Eve controls the
initial memory state, and the final qubit that he may
lose if he cannot access the final memory state both have
a negligible impact on the transmission rate, and therefore
$Q_{EE,\mu}(S_s) = \lim_{n\to\infty} \frac{n-1}{n}\,\mathrm{ld}\, d = \mathrm{ld}\, d \;\; \forall \mu \in \mathcal{B}_*(\mathcal{H}_M)$,
with $d := \dim\mathcal{H}_A = \dim\mathcal{H}_B = \dim\mathcal{H}_M$. Therefore, by
Eq. (6) and Remark 1, all the above capacities equal ld d.
Further examples for channels in which the worst-case capacity
and the best-case capacity are both maximal will
be presented in Section III D.
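The one-step delay of the Shift Channel can be made concrete with a small simulation. This is a sketch under stated assumptions: the state is ordered as memory first, then Alice's inputs, and the single Kraus operator $V|m\rangle|a\rangle = |m\rangle_B |a\rangle_M$ is then simply the identity matrix, because the old memory content becomes Bob's output and Alice's input becomes the new memory. The helpers `concatenate` and `marginal` and the chosen states are illustrative, not from the text.

```python
import numpy as np
from functools import reduce

def concatenate(kraus, state, n, dA, dB):
    """n uses of a memory channel on M (x) A_1..A_n -> B_1..B_n (x) M."""
    for k in range(1, n + 1):
        left, right = np.eye(dB ** (k - 1)), np.eye(dA ** (n - k))
        full = [reduce(np.kron, (left, s, right)) for s in kraus]
        state = sum(f @ state @ f.conj().T for f in full)
    return state

def marginal(state, dims, keep):
    """Reduced density matrix of tensor factor `keep` (partial trace over the rest)."""
    t = state.reshape(list(dims) + list(dims))
    for j in reversed(range(len(dims))):
        if j != keep:
            m = t.ndim // 2
            t = np.trace(t, axis1=j, axis2=j + m)
    return t

d = 2
shift = [np.eye(d * d)]   # V|m>|a> = |m>_B |a>_M needs no reordering here

eve = np.diag([0.0, 1.0]).astype(complex)               # Eve's memory state
rho1 = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)  # Alice's first signal
rho2 = np.diag([1.0, 0.0]).astype(complex)              # Alice's second signal

state_in = reduce(np.kron, (eve, rho1, rho2))           # M (x) A_1 (x) A_2
state_out = concatenate(shift, state_in, 2, d, d)       # B_1 (x) B_2 (x) M

bob_first = marginal(state_out, [d, d, d], 0)     # Eve's state
bob_second = marginal(state_out, [d, d, d], 1)    # Alice's first signal
final_memory = marginal(state_out, [d, d, d], 2)  # Alice's second signal
```

With n uses, Bob therefore receives n - 1 of Alice's signals intact when he cannot read the final memory, which is the counting behind the rate quoted above.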

An example of a memory channel in which the control 
over the initializing memory state can have a decisive in- 
fluence on the channel performance is the channel with a 
global classical switch: Suppose that the memory algebra 
is a classical d-level system of diagonal d x d matrices, 
and that we are given a collection $\{T_i\}_{i=1}^{d}$ of d quantum
memoryless channels $T_i\colon \mathcal{B} \to \mathcal{A}$. Then a quantum memory
channel $S\colon \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$ with a global classical
switch (d settings) is given by

$$S(b \otimes m) = \sum_{i=1}^{d} \langle i|m|i\rangle\, |i\rangle\langle i| \otimes T_i(b). \qquad (7)$$

In an n-fold concatenation of this channel, the channel
$T_i$ is applied in every time step if the initial memory input
state was $|i\rangle\langle i|$. If Alice initially sends a pre-defined
sequence of test states, Bob may find out what the initial
memory setting was and choose the decoding channel
accordingly. Thus, the best case capacity in this setting
will be $\max_{j=1,\dots,d}\{Q(T_j)\}$, and the worst case capacity
will be no larger than $\min_{j=1,\dots,d}\{Q(T_j)\}$. These two may
clearly differ.
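Eq. (7) can be verified against an explicit Kraus representation. The sketch below assumes Kraus operators of the form $s_{i,k} = F\,(|i\rangle\langle i| \otimes t_{i,k})$, where $t_{i,k}$ are Kraus operators of $T_i$ and $F$ is the flip from $\mathcal{H}_M \otimes \mathcal{H}_B$ to $\mathcal{H}_B \otimes \mathcal{H}_M$; this construction, the helper names, and the two example channels are illustrative assumptions rather than the authors' notation.

```python
import numpy as np

def swap(d1, d2):
    """Unitary sending |i>|j> in C^d1 (x) C^d2 to |j>|i> in C^d2 (x) C^d1."""
    U = np.zeros((d1 * d2, d1 * d2))
    for i in range(d1):
        for j in range(d2):
            U[j * d1 + i, i * d2 + j] = 1.0
    return U

def proj(i, d):
    """Projector |i><i| on the classical memory."""
    P = np.zeros((d, d))
    P[i, i] = 1.0
    return P

def switch_kraus(channels):
    """Kraus operators H_M (x) H_A -> H_B (x) H_M of the switch channel.

    channels[i] lists the Kraus operators t_{i,k}: H_A -> H_B of T_i.
    """
    d = len(channels)
    ops = []
    for i, ts in enumerate(channels):
        for t in ts:
            dB = t.shape[0]
            ops.append(swap(d, dB) @ np.kron(proj(i, d), t))
    return ops

def heisenberg(kraus, Y):
    """Heisenberg picture action: sum_k s_k^* Y s_k."""
    return sum(s.conj().T @ Y @ s for s in kraus)

# Illustrative switch with d = 2 settings: ideal channel or bit flip.
q = 0.25
X = np.array([[0, 1], [1, 0]], dtype=complex)
T0 = [np.eye(2, dtype=complex)]
T1 = [np.sqrt(1 - q) * np.eye(2), np.sqrt(q) * X]
channels = [T0, T1]
kraus = switch_kraus(channels)

b = np.array([[1, 2], [2, -1]], dtype=complex)          # Bob's observable
m = np.array([[0.3, 0.4], [0.4, 0.7]], dtype=complex)   # memory observable

lhs = heisenberg(kraus, np.kron(b, m))                  # S(b (x) m)
rhs = sum(m[i, i] * np.kron(proj(i, 2), heisenberg(Ti, b))
          for i, Ti in enumerate(channels))             # right side of Eq. (7)
```

Only the diagonal entries of `m` survive, which reflects that the memory is classical and is passed on unchanged.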



D. Pure Channels 

Pure memory channels are channels which have only
one Kraus operator in Eq. (1). From the unitality condition,
$S(\mathbb{1}) = \mathbb{1}$, it is then clear that these channels have
a Kraus representation $S(b \otimes m) = V^*(b \otimes m)V$ with
isometric $V\colon \mathcal{H}_M \otimes \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_M$.

In this section we will show that for pure channels with
finite memory, the various capacities introduced in Def. 1
coincide and are maximal, i.e., we have the following

Theorem 1 Let $S\colon \mathcal{B}(\mathcal{H}_B) \otimes \mathcal{B}(\mathcal{H}_M) \to \mathcal{B}(\mathcal{H}_M) \otimes \mathcal{B}(\mathcal{H}_A)$
be a pure quantum memory channel with finite memory
algebra $\mathcal{B}(\mathcal{H}_M)$. With the convention introduced in Remark 2
we then have:

$$Q_*(S) = \min\{\mathrm{ld}\dim\mathcal{H}_A,\, \mathrm{ld}\dim\mathcal{H}_B\} = C_*(S). \qquad (8)$$

Our strategy for the proof is to show that for pure channels
it is possible to satisfy the Knill-Laflamme error correction
criteria [31], which imply that perfect signal recovery
can be achieved. This is even more than what
is required for capacity purposes, since the definition of
channel capacity, as presented in Section III B, only demands
that errors vanish asymptotically, i.e., in the limit
of long messages $n \to \infty$.

Since we will have to refer to them repeatedly in the
course of the proof, we start by restating the Knill-Laflamme
conditions for perfect error correction (cf. Th.
10.1 in [12]): A necessary and sufficient condition for
a quantum channel $T\colon \mathcal{B}(\mathcal{H}_2) \to \mathcal{B}(\mathcal{H}_1)$ with Kraus operators
$\{t_i\}_i$ to be completely correctable on a subspace
$\mathcal{K} \subset \mathcal{H}_1$ is the existence of an orthonormal basis
$\{|\alpha\rangle\}_\alpha$ of $\mathcal{K}$ such that

$$\langle\alpha|\, t_i^* t_j \,|\beta\rangle = \omega_{ij}\, \langle\alpha|\beta\rangle, \qquad (9)$$

where the coefficients $\omega_{ij} \in \mathbb{C}$ are not permitted to depend
on the basis labels $\alpha$, $\beta$. If the orthonormal basis
$\{|\alpha\rangle\}_\alpha \subset \mathcal{H}_1$ has N elements, we say that there exists a
quantum code of dimension N.
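Condition (9) is straightforward to test numerically for a given set of Kraus operators and a candidate code basis. The following sketch uses the standard three-qubit repetition code purely as an illustration (it is not an example from the text): the code corrects single bit flips but fails for a phase flip. The function name `knill_laflamme` and the noise parameter `q` are assumptions of this sketch.

```python
import numpy as np
from functools import reduce

def knill_laflamme(kraus, code, tol=1e-10):
    """Check <a|t_i^* t_j|b> = w_ij <a|b> for all pairs of Kraus operators.

    code: matrix whose orthonormal columns span the candidate code space.
    """
    dim = code.shape[1]
    for ti in kraus:
        for tj in kraus:
            M = code.conj().T @ ti.conj().T @ tj @ code
            w = M[0, 0]   # the only admissible value of w_ij
            if not np.allclose(M, w * np.eye(dim), atol=tol):
                return False
    return True

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def on_qubit(op, k, n=3):
    """Operator `op` acting on qubit k of n, identity elsewhere."""
    return reduce(np.kron, [op if j == k else I2 for j in range(n)])

# Code space spanned by |000> and |111>.
ket000 = np.zeros(8); ket000[0] = 1.0
ket111 = np.zeros(8); ket111[7] = 1.0
code = np.column_stack([ket000, ket111])

q = 0.1
bit_flips = [np.sqrt(1 - 3 * q) * np.eye(8)] + \
            [np.sqrt(q) * on_qubit(X, k) for k in range(3)]
phase_flip = [np.sqrt(1 - q) * np.eye(8), np.sqrt(q) * on_qubit(Z, 0)]

correctable = knill_laflamme(bit_flips, code)
not_correctable = knill_laflamme(phase_flip, code)
```

Here the code dimension is N = 2; the checker makes no attempt to find a code, it only tests a given one.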

Coming back to pure channels, we see that in the setup 
in which Alice controls the initial memory state and Bob 
can read out the final memory state there is only one 
(isometric) Kraus operator V, and thus it is straightforward
to satisfy Eq. (9) and achieve rates of up to
$\min\{\mathrm{ld}\dim\mathcal{H}_A,\, \mathrm{ld}\dim\mathcal{H}_B\}$.

By Eq. (6) and Remark 1, in order to complete the
proof of Th. 1 it is therefore sufficient to show that
$Q_{EE,\mu}(S) \ge \min\{\mathrm{ld}\dim\mathcal{H}_A,\, \mathrm{ld}\dim\mathcal{H}_B\} \;\; \forall \mu \in \mathcal{B}_*(\mathcal{H}_M)$.
Again we will show that it is possible to satisfy the error-correction
conditions Eq. (9). However, in the worst-case
scenario in which Eve chooses an arbitrary input state
$\mu \in \mathcal{B}_*(\mathcal{H}_M)$ and Bob has no control over the final memory
output, the resulting channel is no longer pure, but
can be given a Kraus representation with no more than
$d_M^2$ Kraus operators, where $d_M := \dim\mathcal{H}_M$.

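Lemma 2 Let $\mathcal{H}_A$, $\mathcal{H}_B$, and $\mathcal{H}_M$ be finite-dimensional
Hilbert spaces, and let $d_M := \dim\mathcal{H}_M$. Suppose that
$S\colon \mathcal{B}(\mathcal{H}_B) \otimes \mathcal{B}(\mathcal{H}_M) \to \mathcal{B}(\mathcal{H}_M) \otimes \mathcal{B}(\mathcal{H}_A)$ is a pure quantum
channel, i.e., $S(b \otimes m) = V^*(b \otimes m)V$ for isometric
$V\colon \mathcal{H}_M \otimes \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_M$. Let $S_\mu\colon \mathcal{B}(\mathcal{H}_B) \to \mathcal{B}(\mathcal{H}_A)$
be the restriction of S to the B-system, with fixed initial
memory state $\mu \in \mathcal{B}_*(\mathcal{H}_M)$. Then $S_\mu$ can be given a
Kraus representation with $d_M^2$ Kraus operators.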

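Proof: Let $\{|\alpha\rangle\}_{\alpha=1}^{d_M}$ be the eigenbasis of $\mu \in \mathcal{B}_*(\mathcal{H}_M)$,
and suppose that $\{|i\rangle\}_i$ and $\{|i'\rangle\}_{i'}$ are orthonormal
bases for $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively. The isometry
$V\colon \mathcal{H}_M \otimes \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_M$ can then be given the representation

$$V = \sum_{\alpha,\beta=1}^{d_M} V_{\alpha,\beta} \otimes |\alpha\rangle\langle\beta| \qquad (10)$$

with operators $V_{\alpha,\beta} = \sum_{i',i} \langle i',\alpha|V|\beta,i\rangle\, |i'\rangle\langle i|$.
From Eq. (10) we see that for arbitrary $\varrho \in \mathcal{B}_*(\mathcal{H}_A)$ and
$b \in \mathcal{B}(\mathcal{H}_B)$ we have

$$\begin{aligned}
\mathrm{tr}\,(\varrho \otimes \mu)\, V^*(b \otimes \mathbb{1}_M) V
 &= \sum_{\alpha,\beta,\gamma=1}^{d_M} \langle\gamma|\mu|\beta\rangle\, \mathrm{tr}\,\big(\varrho\, V_{\alpha,\beta}^*\, b\, V_{\alpha,\gamma}\big) \\
 &= \sum_{\alpha,\beta=1}^{d_M} \mu_\beta\, \mathrm{tr}\,\big(\varrho\, V_{\alpha,\beta}^*\, b\, V_{\alpha,\beta}\big) \qquad (11) \\
 &= \mathrm{tr}\, \varrho\, S_\mu(b),
\end{aligned}$$

where we have set $s_{\mu,\alpha\beta} := \sqrt{\mu_\beta}\, V_{\alpha,\beta}$, and $\{\mu_\beta\}_{\beta=1}^{d_M}$ are
the eigenvalues of $\mu \in \mathcal{B}_*(\mathcal{H}_M)$. Thus, the restricted
channel $S_\mu$ can be given a representation with $d_M^2$ Kraus
operators, as claimed. ■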
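The construction in the proof can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it draws a random isometry `V`, builds the operators $\sqrt{\mu_\beta}\, V_{\alpha,\beta}$ from the eigendecomposition of a chosen memory state `mu`, and compares the resulting Kraus map with the direct computation of Bob's output state (partial trace over the final memory). The dimensions, the random seed, and the function name `restricted_kraus` are arbitrary choices.

```python
import numpy as np

def restricted_kraus(V, mu, dA, dB):
    """Kraus operators sqrt(mu_b) V_{a,b}: H_A -> H_B of the restricted channel.

    V: isometry H_M (x) H_A -> H_B (x) H_M, mu: initial memory state.
    """
    dM = mu.shape[0]
    evals, evecs = np.linalg.eigh(mu)   # mu = sum_b evals[b] |b><b|
    ops = []
    for a in range(dM):
        # 1_B (x) <a| maps H_B (x) H_M -> H_B
        bra = np.kron(np.eye(dB), evecs[:, [a]].conj().T)
        for b in range(dM):
            # |b> (x) 1_A maps H_A -> H_M (x) H_A
            ket = np.kron(evecs[:, [b]], np.eye(dA))
            ops.append(np.sqrt(max(evals[b], 0.0)) * (bra @ V @ ket))
    return ops

dA, dB, dM = 2, 3, 2
rng = np.random.default_rng(0)
G = rng.normal(size=(dB * dM, dM * dA)) + 1j * rng.normal(size=(dB * dM, dM * dA))
V, _ = np.linalg.qr(G)   # orthonormal columns, so V^* V = 1

mu = np.array([[0.6, 0.2], [0.2, 0.4]], dtype=complex)        # Eve's memory state
rho = np.array([[0.5, 0.5j], [-0.5j, 0.5]], dtype=complex)    # Alice's input

ops = restricted_kraus(V, mu, dA, dB)
via_kraus = sum(s @ rho @ s.conj().T for s in ops)

# Direct computation: apply V to mu (x) rho and trace out the final memory.
full = V @ np.kron(mu, rho) @ V.conj().T
direct = np.trace(full.reshape(dB, dM, dB, dM), axis1=1, axis2=3)

normalization = sum(s.conj().T @ s for s in ops)   # should equal 1 on H_A
```

The number of operators returned is $d_M^2$ regardless of `dA` and `dB`, which is the point used in the subsequent argument.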

Note that in this representation the number of Kraus 
operators is independent of the dimension of both Alice's 
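and Bob's systems $\mathcal{H}_A$ and $\mathcal{H}_B$, and thus the above result
holds true also for the concatenated memory channel
$S_n\colon \mathcal{B}(\mathcal{H}_B)^{\otimes n} \otimes \mathcal{B}(\mathcal{H}_M) \to \mathcal{B}(\mathcal{H}_M) \otimes \mathcal{B}(\mathcal{H}_A)^{\otimes n}$, independently
of $n \in \mathbb{N}$. Consequently, in the limit $n \to \infty$ of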
long messages our setup corresponds to a channel with 
large input space interacting with a small environment. 
Physical intuition suggests that in such a setup the loss 
of information to the environment should be negligible, 
and it should be possible to operate the channel like an 
almost ideal one. This is the essence of the following 

Lemma 3 Let T: B(H) — > B(H) be a channel with K 
Kraus operators. Then there exists a quantum code of 

dimension at least j . 

Proof: Let $\{t_i\}_{i=1}^{K}$ be a set of Kraus operators for T, and
let $\tau_{i,j} := t_i^* t_j$. In order to find a subspace $\mathcal{K} \subset \mathcal{H}$ of high
dimensionality such that the Knill-Laflamme conditions
Eq. (9) are satisfied, the following strategy may seem
promising: Choose a state vector $\varphi_1 \in \mathcal{H}$ arbitrarily, and
then choose

$$\varphi_2 \in \mathcal{K}_1 := \varphi_1^\perp \cap \bigcap_{i,j=1}^{K} (\tau_{i,j}\,\varphi_1)^\perp. \qquad (12)$$

Iterate this procedure of successive removal of dimensions
until no further state vectors can be found. In every
step, at most $K^2$ dimensions are removed, so this strategy
yields a subspace of dimension $\ge \frac{\dim\mathcal{H}}{K^2}$. Unfortunately,
this procedure does not guarantee that inner products
$\langle\varphi_\alpha|\tau_{i,j}|\varphi_\alpha\rangle$ are independent of the basis labels, as required
by the Knill-Laflamme conditions Eq. (9). However,
this can be accomplished by a carefully balanced
pairing of eigenvectors, at the expense of a smaller code
space:
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Note that any operator r e B(Ti) can be written as the 
weighted sum of two Hermitian operators, r = + 
with r + := t* + r and r_ := i(r* — r). Since the Knill- 
Lafiamme conditions Eq. @ are linear in the operators 
Tij, we may assume without loss that all operators Tjj 
are Hermitian. Let r be one of these operators, and let 
{A Q }^ =1 be the set of its eigenvalues, where d := dim7i 
and multiple eigenvalues appear according to their mul- 
tiplicity. Choose uj <E R such that equally many of the 
real numbers fj, a := X a — ui lie on the positive and on 
the negative axis. (If necessary, reduce the dimension of 
H by one.) Now, if ijj a is some eigenvector of the opera- 
tor t — u>t corresponding to the eigenvalue [i a > 0, and 
tp-a is an eigenvector corresponding to the eigenvalue 
/i_ Q < 0, by setting 



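$$ \varphi_\alpha := \frac{1}{\sqrt{\mu_\alpha - \mu_{-\alpha}}} \left( \sqrt{-\mu_{-\alpha}}\; \psi_\alpha + \sqrt{\mu_\alpha}\; \psi_{-\alpha} \right) \tag{13} $$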



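we obtain a Hilbert space $\mathcal{K}_1 := \mathrm{lin}\, \{\varphi_\alpha \mid \alpha = 1, \ldots, \tfrac{d}{2}\}$
of dimension $\tfrac{d}{2}$ satisfying the Knill-Laflamme conditions
Eq. (5) for the operator $\tau$, i.e.,

$$ \langle \varphi_\alpha | \tau - \omega \mathbb{1} | \varphi_\beta \rangle = 0 \quad \forall\; \alpha, \beta = 1, \ldots, \tfrac{d}{2}. \tag{14} $$

Now, choose another operator $\tau' \in \{\tau_{i,j}\}_{i,j=1}^{K}$ and
repeat the above pairing procedure on the subspace $\mathcal{K}_1$,
resulting in a subspace $\mathcal{K}_2 \subset \mathcal{K}_1$ of dimension $\tfrac{d}{4}$. After
$K^2$ steps, the resulting subspace has dimension at least
$d / 2^{K^2}$, which is the desired result. ■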
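A single pairing step can be illustrated numerically. The sketch below is not taken from the source: it assumes an even dimension, a non-degenerate median gap in the spectrum, and one particular normalized superposition of a positive-eigenvalue and a negative-eigenvalue eigenvector (the function name `pairing_code` is hypothetical). It returns the shift $\omega$ and an orthonormal basis of a $d/2$-dimensional subspace on which the compression of $\tau - \omega \mathbb{1}$ vanishes.

```python
import numpy as np


def pairing_code(tau):
    """One pairing step for a Hermitian d x d matrix tau (d even).

    Assumes the two eigenvalues around the median are distinct, so that after
    shifting by omega exactly d/2 eigenvalues are negative and d/2 positive.
    Returns (omega, Phi), where the columns of Phi are the paired vectors.
    """
    d = tau.shape[0]
    half = d // 2
    lam, psi = np.linalg.eigh(tau)               # ascending eigenvalues
    omega = 0.5 * (lam[half - 1] + lam[half])    # shift to the median gap
    mu = lam - omega                             # mu[:half] < 0 < mu[half:]
    columns = []
    for a in range(half):
        mu_neg, mu_pos = mu[a], mu[half + a]
        # Weights chosen so that <phi|(tau - omega)|phi> = 0 and <phi|phi> = 1.
        phi = (np.sqrt(-mu_neg) * psi[:, half + a]
               + np.sqrt(mu_pos) * psi[:, a]) / np.sqrt(mu_pos - mu_neg)
        columns.append(phi)
    return omega, np.column_stack(columns)
```

Because distinct pairs use mutually orthogonal eigenvectors, the off-diagonal matrix elements vanish automatically; only the diagonal ones require the balanced weights.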

We can now complete the proof of the theorem. Applying
the Knill-Laflamme code described in the proof
of Lemma 3 to the concatenated memory channel $S_{\mu,n}$
with $d_M^2$ Kraus operators, we immediately see that for
all $\mu \in \mathcal{B}_*(\mathcal{H}_M)$

$$ Q_{EE,\mu} \geq \lim_{n \to \infty} \frac{1}{n}\, \mathrm{ld}\, \frac{d^{\,n}}{2^{d_M^4}} = \mathrm{ld}\, d, \tag{15} $$

where $d := \min\{\dim \mathcal{H}_A, \dim \mathcal{H}_B\}$, as claimed. ■

After completion of the present work we learned that 
closely related results on channels interacting with small 
environments have been obtained independently by G. 
Bowen and S. Mancini [3i|. These authors also show 
that for such channels the Knill-Laflamme error correc- 
tion conditions can be fulfilled. However, instead of the 
pairing of eigenvalues described in the proof of Lemma 3,
their approach uses convex set arguments of Knill et al.
[34], which are based on a generalization of Radon's
theorem [35]. Our approach seems more straightforward, but
this comes at the expense of a weaker estimate, since the
more sophisticated strategy of Knill et al. yields a code
of dimension > j^TKWP) ' 



IV. THE STRUCTURE OF CAUSAL 
CHANNELS 

In the first part of this work we have followed a con- 
structive approach to quantum channels with memory, in 
the sense that quantum channels which process long mes- 
sages were always thought of as concatenations of smaller 
units which process one quantum signal each. In this sec- 
tion we take the alternative view and assume that we are 
a priori given a quantum channel on a long (possibly infi- 
nite) message string. Our interest is then in the internal 
structure of such a quantum channel. As advertised in
the Introduction, we will show in Th. 4 that under very
general assumptions it can be decomposed into a chain 
of quantum memory channels. 

This result requires some mathematical background 
from the theory of infinite-dimensional quantum systems 
and channel representations, most notably quasi-local
algebras and the uniqueness of the minimal Stinespring 
dilation. The relevant material is collected in the 
Appendix. 

To set the stage, imagine that we have at our disposal
a quantum channel which, at every discrete time step,
transforms an input state on some observable algebra $\mathcal{A}$
into an output state on some (possibly different) observable
algebra $\mathcal{B}$. It is represented (in the Heisenberg picture)
by a completely positive and unital map $T: \mathcal{B}_{\mathbb{Z}} \to \mathcal{A}_{\mathbb{Z}}$
between the quasi-local algebras $\mathcal{A}_{\mathbb{Z}}$ and $\mathcal{B}_{\mathbb{Z}}$ on Alice's and
Bob's side of the channel, respectively. In the following,
we will restrict ourselves to translationally invariant
channels, i.e., we assume that $T$ commutes with the shift on
the spin chain: $\sigma_{\mathcal{A}} \circ T = T \circ \sigma_{\mathcal{B}}$. In addition, we impose
the physically reasonable constraint that outputs up to
some time $t$ do not depend on inputs at times $t' > t$,
leading to the following

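Definition 2 A causal channel $T: \mathcal{B}_{\mathbb{Z}} \to \mathcal{A}_{\mathbb{Z}}$ is a completely
positive and unital translationally invariant map
such that for every $z \in \mathbb{Z}$

$$ T\big(b_{(-\infty,z]} \otimes \mathbb{1}_{[z+1,\infty)}\big) = T\big(b_{(-\infty,z]}\big) \otimes \mathbb{1}_{[z+1,\infty)} \tag{16} $$

for all $b_{(-\infty,z]} \in \mathcal{B}_{(-\infty,z]}$.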



Bearing in mind that $T$ is translationally invariant, we
will henceforth set $z = 0$, and we will use the shorthands
$\mathcal{A}_- := \mathcal{A}_{(-\infty,0]}$ and $\mathcal{A}_+ := \mathcal{A}_{[1,\infty)}$ to denote the left and
right half chain, respectively. $\mathcal{B}_-$ and $\mathcal{B}_+$ are defined
analogously.

It is obvious from the definition that a concatenated
memory channel satisfies the causality property Eq. (16).
In this section we will prove the converse: every causal
channel can be represented as a concatenated memory
channel. Thus, we have the following Structure Theorem
for causal channels (cf. Fig. 2):

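Theorem 4 Let $T: \mathcal{B}_{\mathbb{Z}} \to \mathcal{A}_{\mathbb{Z}}$ be a causal channel. Ignore
its outputs on the left half chain $\mathcal{B}_-$. Then there exists a
memory observable algebra $\mathcal{M}$ and an initializing channel
$R: \mathcal{M} \to \mathcal{A}_-$ such that $\forall\, n \in \mathbb{N}$

$$ T(\mathbb{1}_- \otimes b_n) = (R \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n})\, S_n(b_n \otimes \mathbb{1}_{\mathcal{M}}) \tag{17} $$

for all $b_n \in \mathcal{B}_{[1,n]} \cong \mathcal{B}^{\otimes n}$, where $S_n$ is the n-fold
concatenation of a memory channel $S: \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$.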

Proof: In the finite-dimensional setup, a corresponding 
theorem has been proved by Eggeling et al. [3(| . Here we 
generalize this result to channels on quasi-local algebras. 
The Appendix contains all the background information 
and terminology relevant to the proof of the theorem. 
As in the finite-dimensional setting, the uniqueness of 
the minimal Stinespring representation will play a crucial 
role. 

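Let $\mathcal{H}$ be the Hilbert space associated with the universal
representation of the left half chain $\mathcal{A}_-$. Note that in
general $\mathcal{H}$ will not be separable. However, separability
is not required in Stinespring's Theorem. Suppose that
$(\mathcal{K}, \pi, V)$ is a minimal Stinespring dilation for $T|_{\mathcal{B}_-}$, i.e.,

$$ T(b) = V^* \pi(b)\, V \quad \forall\; b \in \mathcal{B}_- \tag{18} $$

for some Stinespring isometry $V: \mathcal{H} \to \mathcal{K}$. In the sequel,
we will make repeated use of the Hilbert space isomorphism
$\mathcal{H} \cong \mathcal{H} \otimes (\mathbb{C}^d)^{\otimes n}$ (cf. Ch. 3 of Kreyszig's text),
where $\mathcal{A} = \mathcal{B}(\mathbb{C}^d)$ for some $d \in \mathbb{N}$. From Stinespring's
representation Eq. (18) and the causality property
Eq. (16) we may then conclude that

$$ \begin{aligned}
V^* \pi(b \otimes \mathbb{1}_{\mathcal{B}}^{\otimes n})\, V &= T(b \otimes \mathbb{1}_{\mathcal{B}}^{\otimes n}) \\
 &= T(b) \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n} \\
 &= (V^* \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n})\, (\pi(b) \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n})\, (V \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n})
\end{aligned} \tag{19} $$

for all $b \in \mathcal{B}_-$. Since $V$ is a minimal dilation for $T$, so
is $V \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n}$ for $T \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n}$. As explained in
the Appendix, we may then conclude that there exists an
isometry $W_n: \mathcal{K} \otimes (\mathbb{C}^d)^{\otimes n} \to \mathcal{K}$ defined by

$$ W_n\, (\pi(b) \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n})\, (V \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n})\, \psi \otimes \psi_n := \pi(b \otimes \mathbb{1}_{\mathcal{B}}^{\otimes n})\, V\, \psi \otimes \psi_n \tag{20} $$

for all $b \in \mathcal{B}_-$, $\psi \in \mathcal{H}$ and $\psi_n \in (\mathbb{C}^d)^{\otimes n}$ such that

$$ \pi(b \otimes \mathbb{1}_{\mathcal{B}}^{\otimes n})\, W_n = W_n\, (\pi(b) \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n}) \tag{21} $$

for all $b \in \mathcal{B}_-$, and

$$ W_n\, (V \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n}) = V. \tag{22} $$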

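We are now in a position to reconstruct the memory algebra:
Let $\mathcal{M} := \pi(\mathcal{B}_-)'$, the commutant of the observable
algebra $\pi(\mathcal{B}_-)$, and let
$S_n: \mathcal{B}^{\otimes n} \otimes \mathcal{M} \to \mathcal{B}(\mathcal{K}) \otimes \mathcal{B}((\mathbb{C}^d)^{\otimes n})$ be
defined by

$$ S_n(b \otimes m) := W_n^*\, \pi(b)\, m\, W_n \tag{23} $$

for all $b \in \mathcal{B}^{\otimes n}$ and $m \in \mathcal{M}$. The memory initializing
channel $R: \mathcal{M} \to \mathcal{A}_-$ is given by

$$ R(m) := V^*\, m\, V \quad \forall\; m \in \mathcal{M}. \tag{24} $$

In order to justify these choices, we will first show that

$$ S_n(\mathcal{B}^{\otimes n} \otimes \mathcal{M}) \subset \mathcal{M} \otimes \mathcal{A}^{\otimes n}. \tag{25} $$

Noting that $\pi(\mathbb{1}_{\mathcal{B}_-} \otimes \mathcal{B}^{\otimes n})\, \mathcal{M} \subset \pi(\mathcal{B}_- \otimes \mathbb{1}_{\mathcal{B}}^{\otimes n})'$, we see
from Eq. (21) that

$$ \begin{aligned}
W_n^*\, \pi(\mathbb{1}_{\mathcal{B}_-} \otimes b_n)\, m\, W_n\, \big(\pi(b_{\mathcal{B}_-}) \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n}\big)
 &= W_n^*\, \pi(\mathbb{1}_{\mathcal{B}_-} \otimes b_n)\, m\, \pi(b_{\mathcal{B}_-} \otimes \mathbb{1}_{\mathcal{B}}^{\otimes n})\, W_n \\
 &= W_n^*\, \pi(b_{\mathcal{B}_-} \otimes \mathbb{1}_{\mathcal{B}}^{\otimes n})\, \pi(\mathbb{1}_{\mathcal{B}_-} \otimes b_n)\, m\, W_n \\
 &= \big(\pi(b_{\mathcal{B}_-}) \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n}\big)\, W_n^*\, \pi(\mathbb{1}_{\mathcal{B}_-} \otimes b_n)\, m\, W_n
\end{aligned} \tag{26} $$

for all $b_n \in \mathcal{B}^{\otimes n}$ and $b_{\mathcal{B}_-} \in \mathcal{B}_-$, implying that

$$ \big[\, S_n(b_n \otimes m),\; \pi(b_{\mathcal{B}_-}) \otimes \mathbb{1}_{\mathcal{A}}^{\otimes n} \,\big] = 0, \tag{27} $$

from which Eq. (25) directly follows. To complete the
proof, it suffices to show that $S_n$ has the right concatenation
properties, i.e.,

$$ R(m) = (R \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n})\, S_n(\mathbb{1}_{\mathcal{B}}^{\otimes n} \otimes m) \quad \text{and} \tag{28} $$

$$ T(b) = (R \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n})\, S_n(b \otimes \mathbb{1}_{\mathcal{M}}) \tag{29} $$

for all $m \in \mathcal{M}$ and $b \in \mathcal{B}^{\otimes n}$. However, this is immediate
from the definitions of $S_n$ and $R$ and Eq. (22). The
result then follows by setting $S := S_1$. ■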

As can be seen from the above reasoning, the commutant
algebra $\mathcal{M}$ can be replaced by the von Neumann
algebra generated by all elements $(\mathrm{id}_{\mathcal{M}} \otimes \omega_n)\, S_n(b_n \otimes \mathbb{1}_{\mathcal{M}})$.
However, note that in the above construction there is no
unique way of choosing the memory algebra: given an
infinite chain of memory channels with memory algebra
$\mathcal{M}$, considering it as a causal channel and applying
the memory reconstruction as in the proof of Th. 4
will in general yield a different memory algebra $\mathcal{M}' \neq \mathcal{M}$.

It is clear from the proof of Th. 4 that the channel
reconstruction will in general explicitly depend on the
input initializer $R$, which describes the influence of input
states in the remote past on the memory. In the
following section we will turn our attention to an important
class of memory channels for which the memory
initializer becomes completely irrelevant. These so-called
forgetful channels therefore bridge the axiomatic and the
constructive approach to quantum channels with memory.
We will also show that generic memory channels are
forgetful.



V. FORGETFUL CHANNELS 

Forgetful channels are quantum memory channels
$S: \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$ in which the effect of the initializing
memory state dies away with time. More formally,
we have the following






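Definition 3 Let $S: \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$ be a quantum
memory channel, $S_n$ its n-fold concatenation, and let
$\hat{S}_n: \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}^{\otimes n}$ be the concatenated channel in which
Bob's outputs are ignored: $\hat{S}_n(m) := S_n(\mathbb{1}_{\mathcal{B}}^{\otimes n} \otimes m)$ for all
$m \in \mathcal{M}$. Then $S$ is called forgetful iff there exists a
sequence of quantum channels $\tilde{S}_n: \mathcal{M} \to \mathcal{A}^{\otimes n}$ such that

$$ \lim_{n \to \infty} \| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb} = 0. \tag{30} $$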

As an illustrative example, let us consider the classically
mixed channel $S := p\, \mathrm{id} + (1 - p)\, S_s$, where $p \in [0, 1)$, and
$S_s$ denotes the shift channel introduced in Section III C.
When this channel is concatenated, in every step either
the ideal channel or the shift channel is chosen with
probabilities $p$ and $1 - p$, respectively. The only possible way
for an n-fold concatenation $S_n$ not to be forgetful is to
choose the ideal channel id in every step. However, the
probability for this event is $p^n$, and thus vanishes in the
limit $n \to \infty$, implying that Eq. (30) holds.

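Remark 5 Note that Def. 3 can be relaxed by requiring
only that $(\tilde{S}_n)_{n \in \mathbb{N}}$ is a sequence of linear maps,
yet not necessarily channels. To see that this leads
to an equivalent definition of forgetfulness, assume that
$\| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb} \leq \varepsilon$ for some $\varepsilon > 0$, $n \in \mathbb{N}$, and some
linear operator $\tilde{S}_n$. Replacing $\mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n$ with the quantum
channel $(P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ \hat{S}_n$, where $P: \mathcal{M} \to \mathbb{C}\, \mathbb{1}_{\mathcal{M}}$ is
the completely depolarizing channel, and using that
$(P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ (\mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n) = \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n$, we see that

$$ \begin{aligned}
\| \hat{S}_n - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ \hat{S}_n \|_{cb}
 &\leq \| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb}
   + \| (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ (\mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n - \hat{S}_n) \|_{cb} \\
 &\leq 2\, \| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb} \leq 2 \varepsilon,
\end{aligned} \tag{31} $$

and thus $\lim_{n \to \infty} \| \hat{S}_n - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ \hat{S}_n \|_{cb} = 0$, implying
that $S$ is indeed forgetful in the sense of Def. 3.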

There exist several equivalent criteria for a quantum
memory channel to be forgetful. In particular, it is sufficient
to show that the norm distance $\| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb}$
falls below 1 for some $n \in \mathbb{N}$. What is more important,
the memory effects can always be assumed to vanish
exponentially fast. In addition, if the memory algebra $\mathcal{M}$
has finite dimension, the cb-norm criterion Eq. (30) can
be replaced by the usual operator norm $\| \cdot \|_\infty$. In fact,
we have the following

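Proposition 5 Let $S: \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$ be a quantum
memory channel, and for $n \in \mathbb{N}$ let $\hat{S}_n$ be defined as in
Def. 3. Then $S$ is forgetful iff there exists an integer
$N \in \mathbb{N}$ and some linear operator $\tilde{S}_N: \mathcal{M} \to \mathcal{A}^{\otimes N}$ (not
necessarily a channel) such that

$$ \| \hat{S}_N - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_N \|_{cb} < 1. \tag{32} $$

Assume in addition that the memory algebra $\mathcal{M}$ has finite
dimension. Then $S$ is forgetful iff for every $m \in \mathcal{M}$
and $\varepsilon > 0$ we may find a positive integer $N \in \mathbb{N}$ and
$a_N \in \mathcal{A}^{\otimes N}$ such that

$$ \| \hat{S}_N(m) - \mathbb{1}_{\mathcal{M}} \otimes a_N \|_\infty \leq \varepsilon\, \| m \|_\infty. \tag{33} $$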

As advertised above, in the proof of Prop. 5 we will also
be concerned with the speed of convergence in Eq. (30).
In this context, the following Lemma will be helpful:

Lemma 6 Let $(d_n)_{n \in \mathbb{N}}$ be a positive and non-increasing
sequence satisfying the submultiplicativity inequality

$$ d_{n+m} \leq d_n\, d_m \quad \forall\; n, m \in \mathbb{N}. \tag{34} $$

Assume further that $d_N < 1$ for some $N \in \mathbb{N}$. Then

$$ d_n \leq c^n \quad \forall\; n \geq N \tag{35} $$

for some constant $c < 1$, i.e., $(d_n)_{n \in \mathbb{N}}$ vanishes
exponentially.

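Proof of Lemma 6: Assume that $d_N < 1$ for some
$N \in \mathbb{N}$. From the submultiplicativity inequality (34) we then
see that $d_{N+N} \leq d_N^2$, and, by induction, $d_{\nu N} \leq d_N^{\nu}$ for
all $\nu \in \mathbb{N}$. By the monotonicity of $(d_n)_{n \in \mathbb{N}}$ we may then
conclude that for $n \in [\nu N, (\nu + 1) N)$ with $\nu \geq 1$ we have

$$ d_n \leq d_{\nu N} \leq d_N^{\nu} \leq d_N^{\,n/(2N)} = c^n \tag{36} $$

with $c := d_N^{1/(2N)} < 1$, as advertised. Here the third inequality
uses $n < (\nu + 1) N \leq 2 \nu N$. ■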
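The statement of the lemma is easy to test on a toy sequence. The sketch below is purely illustrative: the sequence `toy_sequence` is an assumption chosen for this example (it is positive, non-increasing and submultiplicative), and the rate $c = d_N^{1/(2N)}$ is one admissible choice, valid because $d_n \leq d_N^{\nu} \leq d_N^{n/(2N)}$ whenever $\nu N \leq n < (\nu+1)N$ with $\nu \geq 1$.

```python
def toy_sequence(n):
    """A positive, non-increasing, submultiplicative toy sequence d_n."""
    return min(1.0, 2.0 * 0.7 ** n)


def exponential_rate(d, N):
    """Return c = d(N)**(1/(2N)), so that d(n) <= c**n for all n >= N.

    Requires 0 < d(N) < 1; d is a callable n -> d_n.
    """
    d_N = d(N)
    if not 0.0 < d_N < 1.0:
        raise ValueError("need 0 < d_N < 1")
    return d_N ** (1.0 / (2 * N))
```

For the toy sequence, `exponential_rate(toy_sequence, 2)` gives a rate slightly below 1; the bound is far from tight, which reflects that the lemma only asserts exponential decay, not the optimal rate.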

For the second part of the proof of Prop. 5 we
need to bound the cb-norm $\| \cdot \|_{cb}$ of a linear
operator $R: \mathcal{B}(\mathcal{H}_M) \to \mathcal{A}$ with $\dim \mathcal{H}_M < \infty$ in terms
of its operator norm $\| \cdot \|_\infty$. This is the essence of the
following

Lemma 7 Let $R: \mathcal{B}(\mathcal{H}_M) \to \mathcal{A}$ be a linear operator, and
assume that $d_M := \dim \mathcal{H}_M < \infty$. We then have

$$ \| R \|_{cb} \leq d_M^2\, \| R \|_\infty. \tag{37} $$

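Proof of Lemma 7: By definition of the cb-norm,
we have $\| R \|_{cb} = \sup_k \| R \otimes \mathrm{id}_k \|_\infty$, where $\mathrm{id}_k$ is the
identity operation on the $k \times k$ matrices $\mathcal{B}(\mathbb{C}^k)$. Every
$x \in \mathcal{B}(\mathcal{H}_M) \otimes \mathcal{B}(\mathbb{C}^k)$ can be given the expansion

$$ x = \sum_\alpha m_\alpha \otimes k_\alpha
     = \sum_\alpha \sum_{i,j=1}^{d_M} \mu_{\alpha,ij}\, |i\rangle\langle j| \otimes k_\alpha
     = \sum_{i,j=1}^{d_M} |i\rangle\langle j| \otimes x_{ij}, \tag{38} $$

where we have set $x_{ij} := \sum_\alpha \mu_{\alpha,ij}\, k_\alpha$. Note that
$\| x_{ij} \|_\infty \leq \| x \|_\infty$ $\forall\; i, j = 1, \ldots, d_M$, implying that

$$ \begin{aligned}
\| (R \otimes \mathrm{id}_k)\, x \|_\infty
 &= \Big\| \sum_{i,j} R(|i\rangle\langle j|) \otimes x_{ij} \Big\|_\infty \\
 &\leq \sum_{i,j} \| R \|_\infty\, \big\| |i\rangle\langle j| \big\|_\infty\, \| x_{ij} \|_\infty \\
 &\leq d_M^2\, \| R \|_\infty\, \| x \|_\infty
\end{aligned} \tag{39} $$

holds independently of $k$. Consequently, we have
$\| R \|_{cb} = \sup_k \| R \otimes \mathrm{id}_k \|_\infty \leq d_M^2\, \| R \|_\infty$, as claimed. ■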
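That the cb-norm can genuinely exceed the operator norm is illustrated by the transposition map, a standard example that is not discussed in the source and is used here only as an assumption-labelled illustration. Transposition has operator norm 1, yet applying it to the first tensor factor of the flip operator $\mathbb{F}$ (which has norm 1) yields $d$ times a rank-one projector. The sketch below, with hypothetical helper names, verifies this and confirms that the value stays within the $d^2$ bound of the lemma.

```python
import numpy as np


def flip(d):
    """Flip operator F = sum_{i,j} |i,j><j,i| on C^d (x) C^d."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return F


def partial_transpose_first(x, d):
    """Apply (transpose (x) id_d) to an operator x on C^d (x) C^d."""
    x4 = x.reshape(d, d, d, d)           # x4[i, k, j, l] = <i,k| x |j,l>
    return x4.transpose(2, 1, 0, 3).reshape(d * d, d * d)  # swap i <-> j


def amplification(d):
    """Ratio ||(transpose (x) id_d)(F)|| / ||F|| in operator norm."""
    F = flip(d)
    return np.linalg.norm(partial_transpose_first(F, d), 2) / np.linalg.norm(F, 2)
```

The ratio returned by `amplification(d)` is a lower bound on the cb-norm of transposition on $d \times d$ matrices, so the gap between the two norms grows with the dimension while remaining below $d^2$.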

We now have the necessary tools at hand to tackle the 

Proof of Prop. 5: We will first prove the first
part of Prop. 5. Thus, at this point we make no
assumptions on the dimensionality of $\mathcal{M}$. If $S$ is forgetful,
Eq. (32) is immediate from the definition. In order to
prove the converse, let

$$ d_n := \inf \big\{ \| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb} \;\big|\; \tilde{S}_n: \mathcal{M} \to \mathcal{A}^{\otimes n} \text{ linear} \big\} \tag{40} $$

for $n \in \mathbb{N}$. Our strategy is to show that $(d_n)_{n \in \mathbb{N}}$ satisfies
the conditions of Lemma 6. From Eq. (32) we can
then conclude that $d_n \leq c^n$ for all $n \geq N$ for some constant
$c < 1$, and thus $S$ is forgetful with exponentially
vanishing errors by Remark 5.

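We start by showing that $(d_n)_{n \in \mathbb{N}}$ is non-increasing,
i.e., $d_{n+1} \leq d_n$ $\forall\; n \in \mathbb{N}$. From the definition of $\hat{S}_n$, we
have

$$ \begin{aligned}
\hat{S}_{n+1} &= (\hat{S} \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ (\hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n)
   + (\hat{S} \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ (\mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n) \\
 &= (\hat{S} \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ (\hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n)
   + \mathbb{1}_{\mathcal{M}} \otimes \mathbb{1}_{\mathcal{A}} \otimes \tilde{S}_n,
\end{aligned} \tag{41} $$

where $\hat{S} := \hat{S}_1$ and in the last step we have applied the unitality of
$S$. From Eq. (41) and the fact that unital completely positive maps
have cb-norm 1 we may conclude that

$$ d_{n+1} \leq \| \hat{S}_{n+1} - \mathbb{1}_{\mathcal{M}} \otimes \mathbb{1}_{\mathcal{A}} \otimes \tilde{S}_n \|_{cb}
 \leq \| \hat{S} \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n} \|_{cb}\, \| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb}
 = \| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb}. \tag{42} $$

Taking the infimum over $\tilde{S}_n$ yields $d_{n+1} \leq d_n$, just as claimed.
We will now show that $d_{n+m} \leq d_n\, d_m$
for all $n, m \in \mathbb{N}$. Similar to the above estimate, we have

$$ \begin{aligned}
\hat{S}_{n+m} &= (\hat{S}_n \otimes \mathrm{id}_{\mathcal{A}}^{\otimes m}) \circ (\hat{S}_m - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_m)
   + (\hat{S}_n \otimes \mathrm{id}_{\mathcal{A}}^{\otimes m}) \circ (\mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_m) \\
 &= \big( (\hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n) \otimes \mathrm{id}_{\mathcal{A}}^{\otimes m} \big) \circ (\hat{S}_m - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_m)
   + \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_{n+m},
\end{aligned} \tag{43} $$

where we have introduced the shorthand

$$ \tilde{S}_{n+m} := \mathbb{1}_{\mathcal{A}}^{\otimes n} \otimes \tilde{S}_m
   + (\tilde{S}_n \otimes \mathrm{id}_{\mathcal{A}}^{\otimes m}) \circ (\hat{S}_m - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_m). \tag{44} $$

Invoking again the multiplicativity of the cb-norm under tensor
products and compositions, we may conclude from Eq. (43) that

$$ \| \hat{S}_{n+m} - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_{n+m} \|_{cb}
 \leq \| \hat{S}_n - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_n \|_{cb}\, \| \hat{S}_m - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_m \|_{cb}, \tag{45} $$

which is the desired estimate. Note that $\tilde{S}_{n+m}$ is clearly
linear and unital, but not necessarily positive. This is
why we did not require the maps $\tilde{S}_n$ to be channels in
the definition of the sequence $(d_n)_{n \in \mathbb{N}}$. This completes
the first part of the proof. ▲

For the second part, assume that $\mathcal{M} = \mathcal{B}(\mathcal{H}_M)$ with
$d_M := \dim \mathcal{H}_M < \infty$. If Eq. (33) holds, by the same
reasoning as in Remark 5 we may conclude that $\mathbb{1}_{\mathcal{M}} \otimes a_N$
may be replaced by $(P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes N}) \circ \hat{S}_N(m)$, implying that
for every $m \in \mathcal{M}$ and $\varepsilon > 0$ we may find a positive integer
$N \in \mathbb{N}$ such that

$$ \| \hat{S}_N(m) - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes N}) \circ \hat{S}_N(m) \|_\infty \leq 2 \varepsilon\, \| m \|_\infty. \tag{46} $$

In order to arrive at a uniform bound, let us introduce
an orthonormal basis $\{|i\rangle\}_{i=1}^{d_M}$ for $\mathcal{H}_M$. Since $\mathcal{H}_M$ has
finite dimension, Eq. (46) holds uniformly for the basis
operators $\{|i\rangle\langle j|\}_{i,j=1}^{d_M}$ for some possibly larger $N$. Thus,
by setting $m = \sum_{i,j=1}^{d_M} m_{i,j}\, |i\rangle\langle j|$ we see that

$$ \begin{aligned}
\| \hat{S}_N(m) - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes N}) \circ \hat{S}_N(m) \|_\infty
 &\leq \sum_{i,j=1}^{d_M} |m_{i,j}|\, \big\| \hat{S}_N(|i\rangle\langle j|) - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes N}) \circ \hat{S}_N(|i\rangle\langle j|) \big\|_\infty \\
 &\leq 2 \varepsilon \sum_{i,j=1}^{d_M} |m_{i,j}| \leq 2 \varepsilon\, d_M^2\, \| m \|_\infty,
\end{aligned} \tag{47} $$

where in the last step we have used that $|m_{i,j}| \leq \| m \|_\infty$
for all $i, j = 1, \ldots, d_M$. Making use of Lemma 7 we may
conclude from Eq. (47) that

$$ \| \hat{S}_N - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes N}) \circ \hat{S}_N \|_{cb}
 \leq d_M^2\, \| \hat{S}_N - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes N}) \circ \hat{S}_N \|_\infty
 \leq 2 \varepsilon\, d_M^4. \tag{48} $$

Thus, choosing $\varepsilon < \frac{1}{2 d_M^4}$, we may find an integer $N \in \mathbb{N}$
such that Eq. (32) holds. Therefore, $S$ is forgetful by
the first part of the proof. The converse is immediate
from the definition of forgetfulness. ■

From the proof of Prop. 5 we may immediately deduce
the following

Corollary 8 Let $S: \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$ be a forgetful
quantum channel. Then the effect of the initial memory
vanishes exponentially fast, i.e., we may find a constant
$c < 1$ such that

$$ \| \hat{S}_n - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ \hat{S}_n \|_{cb} \leq c^n \tag{49} $$

for all sufficiently large $n$.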



For convenience, and because we will use it later in
Section VI, in the following Proposition we show how the
definition of forgetfulness translates into the Schrödinger
picture language.

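Proposition 9 Let $S: \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$ be a quantum
channel. Let $\varepsilon > 0$, and for $n \in \mathbb{N}$ let $\hat{S}_n$ be defined as in
Def. 3. Assume that

$$ \| \hat{S}_n - (P \otimes \mathrm{id}_{\mathcal{A}}^{\otimes n}) \circ \hat{S}_n \|_\infty \leq \varepsilon, \tag{50} $$

where $P: \mathcal{M} \to \mathbb{C}\, \mathbb{1}_{\mathcal{M}}$ is a completely depolarizing channel.
We then have

$$ \| \mathrm{tr}_{\mathcal{B}^{\otimes n}}\, S_{n*}(\varrho_1 - \varrho_2) \|_1 \leq 2 \varepsilon \tag{51} $$

for all density operators $\varrho_1, \varrho_2 \in \mathcal{M}_* \otimes \mathcal{A}_*^{\otimes n}$ such that
$\mathrm{tr}_{\mathcal{M}}\, \varrho_1 = \mathrm{tr}_{\mathcal{M}}\, \varrho_2$.

Conversely, suppose that Eq. (51) holds. Then Eq. (50)
holds with the substitution $\varepsilon \mapsto 2 \varepsilon$.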

In particular, if the quantum channel $S$ is forgetful,
then from Remark 5 we know that the condition in
Eq. (50) is satisfied, and thus Eq. (51) holds. If in
addition the memory algebra $\mathcal{M}$ is finite-dimensional,
Eq. (50) is a necessary and sufficient criterion for
forgetfulness by Prop. 5. By the above Proposition,
Eq. (51) then gives a necessary and sufficient criterion
for forgetfulness in the Schrödinger picture language.

Proof of Prop. 9: Note that for any linear operator
$T: \mathcal{B} \to \mathcal{A}$, the operator norm $\| T \|_\infty$ equals the
norm of the adjoint operator on the dual space, i.e.,

$$ \| T \|_\infty = \sup_{\| \varrho \|_1 \leq 1} \| T_*(\varrho) \|_1 \tag{52} $$

(cf. Ch. VI of [H or Section 2.4 of [H for details). 
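Suppose that Eq. (50) holds. Since $P_* \otimes \mathrm{id}_{\mathcal{A}*}^{\otimes n}$ acts as $\mathrm{tr}_{\mathcal{M}}$, the
partial trace on the memory algebra $\mathcal{M}$ (followed by the preparation
of a fixed memory state), we may conclude
from Eq. (50) and the norm duality Eq. (52) that

$$ \| \hat{S}_{n*}(\varrho) - \hat{S}_{n*} \circ (P_* \otimes \mathrm{id}_{\mathcal{A}*}^{\otimes n})(\varrho) \|_1 \leq \varepsilon \quad \forall\; \varrho \in \mathcal{M}_* \otimes \mathcal{A}_*^{\otimes n}, \tag{53} $$

which implies that for arbitrary $\varrho_1, \varrho_2 \in \mathcal{M}_* \otimes \mathcal{A}_*^{\otimes n}$ such
that $\mathrm{tr}_{\mathcal{M}}\, \varrho_1 = \mathrm{tr}_{\mathcal{M}}\, \varrho_2$ we have

$$ \| \hat{S}_{n*}(\varrho_1) - \hat{S}_{n*}(\varrho_2) \|_1 \leq 2 \varepsilon \tag{54} $$

by application of the triangle inequality. Eq. (51) then
follows by noting that $\hat{S}_{n*} = \mathrm{tr}_{\mathcal{B}^{\otimes n}} \circ S_{n*}$.

Conversely, since $\varrho$ and $(P_* \otimes \mathrm{id}_{\mathcal{A}*}^{\otimes n})(\varrho)$ have the same partial
trace over $\mathcal{M}$, from Eq. (51) we can conclude that

$$ \| \hat{S}_{n*}\big(\varrho - (P_* \otimes \mathrm{id}_{\mathcal{A}*}^{\otimes n})(\varrho)\big) \|_1 \leq 2 \varepsilon \quad \forall\; \varrho \in \mathcal{M}_* \otimes \mathcal{A}_*^{\otimes n}, \tag{55} $$

which implies Eq. (50) (with the substitution $\varepsilon \mapsto 2 \varepsilon$)
by means of the norm duality Eq. (52). ■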

Prop. 5 (and its Schrödinger dual Prop. 9) can be employed
to test whether a given quantum memory channel
is forgetful. As an illustrating example, let us consider
the unitary partial flip operation

$$ U_\eta := \cos \eta\; \mathbb{F} + i \sin \eta\; \mathbb{1} \tag{56} $$

with $\eta \in [0, 2\pi)$, where $\mathbb{F} := \sum_{i,j} |i,j\rangle\langle j,i|$ denotes the
so-called flip operator. Since $\mathbb{F}\,(b \otimes m)\,\mathbb{F} = m \otimes b$, for $\eta = 0$
the partial flip is just the shift channel $S_s$ introduced in
Section III C, which we know is forgetful. With the help
of Prop. 5 we will show that the partial flip is forgetful
whenever $\cos \eta > \tfrac{7}{8}$. In fact, it is sufficient to prove that

$$ \| U_\eta - \mathbb{F} \|_\infty < \tfrac{1}{2} \tag{57} $$

holds in the designated parameter range, since this will
immediately imply that

$$ \| U_\eta^* \big(\mathbb{1}_{\mathcal{B}} \otimes (\cdot)\big) U_\eta - \mathbb{F} \big(\mathbb{1}_{\mathcal{B}} \otimes (\cdot)\big) \mathbb{F} \|_{cb} < 1, \tag{58} $$

from which forgetfulness of the partial flip follows by
Prop. 5. To see that Eq. (57) holds, set $\Delta_\eta := U_\eta - \mathbb{F}$
and observe that

$$ \| \Delta_\eta^* \Delta_\eta \|_\infty = 2\,(1 - \cos \eta) < \tfrac{1}{4} \iff \cos \eta > \tfrac{7}{8}. \tag{59} $$
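The norm identity used in this example can be verified numerically. The sketch below is an illustration under stated assumptions: it uses the flip operator on $\mathbb{C}^d \otimes \mathbb{C}^d$ in the computational basis, hypothetical helper names, and takes $\tfrac{1}{2}$ as the threshold on $\| U_\eta - \mathbb{F} \|_\infty$, which by $\| U_\eta - \mathbb{F} \|_\infty^2 = 2(1 - \cos \eta)$ is equivalent to $\cos \eta > \tfrac{7}{8}$.

```python
import numpy as np


def flip(d):
    """Flip operator F = sum_{i,j} |i,j><j,i| on C^d (x) C^d."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return F


def partial_flip(eta, d):
    """Unitary partial flip U_eta = cos(eta) F + i sin(eta) 1."""
    return np.cos(eta) * flip(d) + 1j * np.sin(eta) * np.eye(d * d)


def distance_to_flip(eta, d):
    """Operator-norm distance || U_eta - F ||."""
    return np.linalg.norm(partial_flip(eta, d) - flip(d), 2)


def within_threshold(eta, d):
    """True if || U_eta - F || < 1/2, i.e. if cos(eta) > 7/8."""
    return bool(distance_to_flip(eta, d) < 0.5)
```

For example, `within_threshold(0.3, 2)` holds while `within_threshold(0.6, 2)` does not, in line with $\cos 0.3 \approx 0.955$ and $\cos 0.6 \approx 0.825$. The criterion is only sufficient; it says nothing about forgetfulness outside this parameter range.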

It seems likely that the partial flip is in fact forgetful
over the whole parameter range, apart from $\eta = \tfrac{1}{2}\pi$ and
$\eta = \tfrac{3}{2}\pi$. Evidence for this conjecture comes from the
investigation of so-called collision models by Ziman et al.
[4fl l4l| . who could show forgetfulness of the partial flip 
when the input is restricted to product states $\varrho^{\otimes n}$.

We will prove below that forgetful quantum channels 
are dense in the set of quantum memory channels: for ev- 
ery non-forgetful quantum channel we may find a forget- 
ful memory channel which differs arbitrarily little from 
it. Thus, even the partial flip at $\eta = \tfrac{1}{2}\pi$ and $\eta = \tfrac{3}{2}\pi$
(i.e., the identity $\mathbb{1}$) can be approximated by a forgetful
quantum channel, though not necessarily a unitary one. 

What is more, along the lines of the example presented
above, Prop. 5 can be applied to show that all quantum
channels in a finite-size neighborhood of a given forgetful
quantum channel are likewise forgetful, i.e., the set of
forgetful quantum channels is open. Combined with the
denseness of forgetful quantum channels, this justifies the
claim made in Section I A that generic quantum memory
channels are forgetful:

Theorem 10 The set of forgetful quantum channels is
open and dense in the set of quantum memory channels
in the $\| \cdot \|_{cb}$-norm topology.

Proof: We will first show that the set of forgetful quantum
channels is dense in the set of quantum memory
channels. From any given (not necessarily forgetful)
memory channel $S: \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$ we can easily
construct a forgetful channel by mixing it with the completely
depolarizing channel

$$ D(b \otimes m) := \mathrm{tr}\big((b \otimes m)\, \delta\big)\; \mathbb{1}_{\mathcal{M} \otimes \mathcal{A}}, \tag{60} $$

where $\delta \in \mathcal{B}_* \otimes \mathcal{M}_*$ is an arbitrary quantum state. Just as
in the classically mixed shift channel discussed above, all
the terms in an n-fold concatenation of the mixed channel
$S_\varepsilon := (1 - \varepsilon)\, S + \varepsilon\, D$ yield the identity operator $\mathbb{1}_{\mathcal{M}}$ in the
memory input, possibly apart from the $S_n$-contribution,
which scales as $(1 - \varepsilon)^n$, and thus vanishes as $n \to \infty$.
Since this holds for all $\varepsilon > 0$, and $\| S - S_\varepsilon \|_{cb} \leq 2 \varepsilon$, we
have found a forgetful channel $S_\varepsilon$ arbitrarily close to $S$,
completing the proof. ▲

We will now show that the set of forgetful quantum
channels is open. So assume that we are given a forgetful
memory channel $S: \mathcal{B} \otimes \mathcal{M} \to \mathcal{M} \otimes \mathcal{A}$. We will show
that $S$ has a finite-size neighborhood in which all memory
channels are forgetful. Clearly, by the definition of
forgetfulness we can find $N \in \mathbb{N}$ and a quantum channel
$\tilde{S}_N: \mathcal{M} \to \mathcal{A}^{\otimes N}$ such that $\| \hat{S}_N - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_N \|_{cb} < \tfrac{1}{2}$. Thus,
for all memory channels $T$ such that $\| T - S \|_{cb} \leq \tfrac{1}{2N}$ we
have

$$ \| \hat{T}_N - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_N \|_{cb} \leq \| \hat{S}_N - \mathbb{1}_{\mathcal{M}} \otimes \tilde{S}_N \|_{cb} + N\, \| T - S \|_{cb} < 1, \tag{61} $$

and the forgetfulness of $T$ immediately follows from
Prop. 5. ■

It is instructive to observe that a forgetful channel is
obtained from a possibly non-forgetful one in the denseness
proof of Th. 10 by adding a tiny amount of white
noise. In real-world experiments, such noise will always
be present at some level. Therefore, quantum channels
encountered in the laboratory will generally be forgetful.

However, while every non-forgetful quantum channel
can be approximated by a forgetful memory channel
to arbitrary degree of accuracy, their capacities may
be different. As an example for such a discontinuity
effect, consider the channel with a global classical switch
introduced in Section III C. Let us assume that Alice
and Bob face a situation in which Eve controls the initial
memory state and completely jams the communication.
Then adding a little bit of noise, as in the proof of
Th. 10, will deprive Eve of her control of the initial
memory, and may lead to a channel with positive
transmission rate. Thus, adding noise may actually
be beneficial sometimes. Of course, it is just as easy
to construct examples of memory channels which are
rendered useless by adding a tiny amount of noise.

In the special case of unitary quantum channels,
asymptotically vanishing memory effects have been
investigated by Wellens et al. [43] under the name
asymptotic completeness, with a special focus on the
preparation of arbitrary memory output states. While
asymptotic completeness and forgetfulness are certainly
related concepts, they seem to differ in fine points,
for instance in the choice of the operator topology.
Asymptotic completeness of the Jaynes-Cummings
interaction, which governs the physics of the micromaser
experiment described in Section I B, is claimed as a main
mathematical result in ji^. However, a proof is neither 
available in the cited literature [43] nor upon request.




VI. ENTROPIC BOUNDS AND CHANNEL 
CODING 

While in Section III C and Section III D we have computed
the channel capacity of some interesting model
channels, in this section we will be concerned with statements
that apply more generally. In Section VI A we will
give entropic upper bounds on the capacity for classical
and quantum information transfer. In Section VI B
achievability of these bounds will be demonstrated for
forgetful quantum channels.



A. Entropic Bounds 

It has already been pointed out by Bowen and Mancini 
[l^| that the standard mutual information bound (or 
Holevo bound) 45] on the classical channel capacity as 
well as the coherent information bound 0, 0, 13 
on the quantum capacity can be extended to quantum 
channels with memory. In fact, these bounds ultimately 
depend only on the mutual information between Alice's 
input register and Bob's output register, and are inde- 
pendent of the internal structure of the quantum chan- 
nel that links both parties. The proofs familiar from the 
memoryless setting can therefore be directly applied to 
memory channels, and yield entropic upper bounds on 
the classical and quantum capacity of a quantum memory
channel in all four different settings discussed in
Def. 1.

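Before we state these bounds in Props. 11 and 12
below, we will need to introduce some notation and
terminology. In the following, the von Neumann entropy
of a quantum state $\varrho \in \mathcal{B}_*(\mathcal{H})$ will be denoted
by $H(\varrho) := -\mathrm{tr}(\varrho\; \mathrm{ld}\, \varrho)$. Given a quantum channel (in
Schrödinger picture) $S_*: \mathcal{B}_*(\mathcal{H}_1) \to \mathcal{B}_*(\mathcal{H}_2)$ and an
ensemble $\{p_i, \varrho_i\}_{i=1}^{I}$ of quantum states $\varrho_i \in \mathcal{B}_*(\mathcal{H}_1)$, where
$\{p_i\}_{i=1}^{I}$ is a classical probability distribution, Holevo's
$\chi$-quantity is given by

$$ \chi(S_*, \{p_i, \varrho_i\}) := H\Big( \sum_{i=1}^{I} p_i\, S_*(\varrho_i) \Big) - \sum_{i=1}^{I} p_i\, H\big(S_*(\varrho_i)\big). \tag{62} $$

The coherent information $I_c(S_*, \varrho)$ of the quantum channel
$S_*$ with respect to a state $\varrho \in \mathcal{B}_*(\mathcal{H}_1)$ is likewise given
in terms of the von Neumann entropy,

$$ I_c(S_*, \varrho) := H\big(S_*(\varrho)\big) - H\big( (S_* \otimes \mathrm{id})\, |\psi\rangle\langle\psi| \big), \tag{63} $$

where $\psi \in \mathcal{H}_1 \otimes \mathcal{H}_1$ is a purification of the quantum
state $\varrho \in \mathcal{B}_*(\mathcal{H}_1)$ [12]. With these notations, we have the
following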

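Proposition 11 Let $S_{n*}$ be the n-fold concatenation of
a quantum memory channel $S_*: \mathcal{B}_*(\mathcal{H}_M) \otimes \mathcal{B}_*(\mathcal{H}_A) \to
\mathcal{B}_*(\mathcal{H}_B) \otimes \mathcal{B}_*(\mathcal{H}_M)$. The classical information capacities
of $S$ are bounded from above as follows:

$$ C_{AB}(S) \leq \limsup_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi\big(S_{n*}, \{p_i, \varrho_i\}\big), \tag{64} $$

$$ C_{AE}(S) \leq \limsup_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi\big(\mathrm{tr}_{\mathcal{M}} \circ S_{n*}, \{p_i, \varrho_i\}\big), \tag{65} $$

$$ C_{EB,\mu}(S) \leq \limsup_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi\big(S_{n*}, \{p_i, \mu \otimes \varrho_i\}\big), \tag{66} $$

$$ C_{EE,\mu}(S) \leq \limsup_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi\big(\mathrm{tr}_{\mathcal{M}} \circ S_{n*}, \{p_i, \mu \otimes \varrho_i\}\big), \tag{67} $$

where $\mu \in \mathcal{B}_*(\mathcal{H}_M)$ is Eve's initial memory state. If
$d_M := \dim \mathcal{H}_M < \infty$, the bounds in Eq. (64), Eq. (65)
and in Eq. (66), Eq. (67) coincide pairwise. If the channel
$S$ is forgetful, the bounds in Eq. (64), Eq. (66) and
in Eq. (65), Eq. (67) coincide pairwise.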

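Proposition 12 The quantum information capacities of
the memory channel $S$ are bounded from above as follows:

$$ Q_{AB}(S) \leq \limsup_{n \to \infty} \frac{1}{n} \max_{\varrho} I_c\big(S_{n*}, \varrho\big), \tag{68} $$

$$ Q_{AE}(S) \leq \limsup_{n \to \infty} \frac{1}{n} \max_{\varrho} I_c\big(\mathrm{tr}_{\mathcal{M}} \circ S_{n*}, \varrho\big), \tag{69} $$

$$ Q_{EB,\mu}(S) \leq \limsup_{n \to \infty} \frac{1}{n} \max_{\varrho} I_c\big(S_{n*}, \mu \otimes \varrho\big), \tag{70} $$

$$ Q_{EE,\mu}(S) \leq \limsup_{n \to \infty} \frac{1}{n} \max_{\varrho} I_c\big(\mathrm{tr}_{\mathcal{M}} \circ S_{n*}, \mu \otimes \varrho\big), \tag{71} $$

where $\mu \in \mathcal{B}_*(\mathcal{H}_M)$ is Eve's initial memory state. If
$d_M < \infty$, the bounds in Eq. (68), Eq. (69) and in
Eq. (70), Eq. (71) coincide pairwise. If the channel $S$
is forgetful, the bounds in Eq. (68), Eq. (70) and in
Eq. (69), Eq. (71) coincide pairwise.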
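The two entropic quantities entering these bounds are straightforward to evaluate for small examples. The following is a minimal sketch, assuming channels in the Schrödinger picture given by a list of Kraus matrices acting as $\varrho \mapsto \sum_k K_k\, \varrho\, K_k^*$; the function names are hypothetical, and the purification is built from the eigendecomposition of $\varrho$. Entropies are measured in bits, matching the binary logarithm ld.

```python
import numpy as np


def entropy(rho):
    """Von Neumann entropy H(rho) = -tr(rho ld rho) in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]                  # drop numerically zero eigenvalues
    return float(-np.sum(ev * np.log2(ev)))


def apply_channel(kraus, rho):
    """Schrodinger-picture action rho -> sum_k K_k rho K_k^*."""
    return sum(k @ rho @ k.conj().T for k in kraus)


def holevo_chi(kraus, probs, states):
    """Holevo quantity of the output ensemble {p_i, S(rho_i)}."""
    outs = [apply_channel(kraus, r) for r in states]
    avg = sum(p * o for p, o in zip(probs, outs))
    return entropy(avg) - sum(p * entropy(o) for p, o in zip(probs, outs))


def coherent_information(kraus, rho):
    """Coherent information H(S(rho)) - H((S (x) id)|psi><psi|)."""
    d = rho.shape[0]
    lam, vec = np.linalg.eigh(rho)
    # Purification psi = sum_k sqrt(lam_k) |v_k> (x) |k> on H_1 (x) H_1.
    psi = sum(np.sqrt(max(l, 0.0)) * np.kron(vec[:, k], np.eye(d)[k])
              for k, l in enumerate(lam))
    proj = np.outer(psi, psi.conj())
    extended = [np.kron(k, np.eye(d)) for k in kraus]
    return entropy(apply_channel(kraus, rho)) - entropy(apply_channel(extended, proj))
```

As a sanity check, the ideal qubit channel gives one bit for both quantities on the natural inputs, whereas a completely dephasing qubit channel still transmits one classical bit but has vanishing coherent information for the maximally mixed input.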

Remark 6 Note that the bounds in Props. 11 and 12
still hold when we only require that coding is possible
along some (possibly very sparse) block sequence.
In Def. 1 we have been more ambitious, since
we have required that coding works for arbitrary block
size. When this stronger version of capacity is chosen,
the $\limsup$ can be replaced by $\liminf$ in Eqs. (64) through (71).
While the "optimistic" and the "pessimistic" channel ca- 
pacity coincide for memoryless channels 0, this is not 
clear for channels with memory (cf. Remark 0J. For for- 
getful channels, equivalence does hold, as will be seen in 
Section rvTBl 

Proof of Props. 11 and 12: As indicated above, the
proof transfers directly from the memoryless setting. We
thus refer to Holevo's original work [45] for the classical
bound, and to the works of Barnum et al. 0, 0, E3 
and Devetak ^] for the quantum case. 

Here we only show that the bounds coincide pairwise
under the additional assumption of having a memory of
finite size or a forgetful channel. We will begin with the
finite memory case: Note that the Holevo quantity $\chi$
decreases under quantum operations, i.e.,

$$ \chi\big(R_* \circ S_*, \{p_i, \varrho_i\}\big) \leq \chi\big(S_*, \{p_i, \varrho_i\}\big) \tag{72} $$

for any pair of quantum channels i?„ , S* and any ensem- 
ble of quantum states {pi, gi}i H3- We see from Eq. f?^|) 
that

$$ \chi\big(\mathrm{tr}_{\mathcal{M}} \circ S_{n*}, \{p_i, \varrho_i\}\big) \leq \chi\big(S_{n*}, \{p_i, \varrho_i\}\big)
 \leq \chi\big(\mathrm{tr}_{\mathcal{M}} \circ S_{n*}, \{p_i, \varrho_i\}\big) + 2\; \mathrm{ld}\, d_M, \tag{73} $$

where in the last step the subadditivity of von Neumann
entropy has been applied [32]. From Eq. (73) it immediately
follows that the bounds on $C_{AB}$ and $C_{AE}$ coincide
whenever $d_M < \infty$. The proof for the bounds on $C_{EB,\mu}$
and $C_{EE,\mu}$ is completely analogous.

For the bounds on the quantum capacities, replace
Eq. (72) by the Data Processing Inequality, i.e.,

$$ I_c\big(R_* \circ S_*, \varrho\big) \leq I_c\big(S_*, \varrho\big) \tag{74} $$

for any two quantum channels $R_*$ and $S_*$ [32], and again
apply subadditivity of von Neumann entropy. ▲

In the forgetful setting, in addition to subadditivity of 
von Neumann entropy we will also need to make use of 
its continuity properties. In fact, by Fannes' Inequality 
[12, EiJ we have 

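$$ | H(\varrho) - H(\sigma) | \leq \| \varrho - \sigma \|_1\; \mathrm{ld}\, d + \frac{\mathrm{ld}\, e}{e}, \tag{75} $$

where $\varrho, \sigma \in \mathcal{B}_*(\mathcal{H})$ are quantum states, and $d := \dim \mathcal{H}$.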

By the results of Prop.[!|l forgetfulness of the channel S 
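implies that for any $\varepsilon > 0$ we may find a positive integer
$m \in \mathbb{N}$ such that

$$\| \mathrm{tr}_B^{\otimes m} S_{m*}(\varrho_1 - \varrho_2) \|_1 \le \varepsilon \qquad (76)$$

for all density operators $\varrho_1, \varrho_2 \in \mathcal{B}_*(\mathcal{H}_M) \otimes \mathcal{B}_*(\mathcal{H}_A)^{\otimes n}$
satisfying $\mathrm{tr}_M \varrho_1 = \mathrm{tr}_M \varrho_2$. Applying Fannes' Inequality
Eq. (75) and subadditivity of von Neumann entropy, we
can thus conclude that for arbitrary $\mu \in \mathcal{B}_*(\mathcal{H}_M)$ and
$n \in \mathbb{N}$ we have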

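$$\begin{aligned}
\chi(S_{n*}, \{p_i, \varrho_i\}) &\le \chi(\mathrm{tr}_B^{\otimes m} S_{n*}, \{p_i, \varrho_i\}) + 2m\, \mathrm{ld}\, d_B \\
&\le \chi(\mathrm{tr}_B^{\otimes m} S_{n*}, \{p_i, \mu \otimes \mathrm{tr}_M(\varrho_i)\}) + 2m\, \mathrm{ld}\, d_B + \frac{2\, \mathrm{ld}\, e}{e} + 2 \| \mathrm{tr}_B^{\otimes m} S_{n*}(\varrho_i - \mu \otimes \mathrm{tr}_M(\varrho_i)) \|_1 \, \mathrm{ld}\, d_B \\
&\le \max_{\{p_i, \varrho_i\}} \chi(S_{n*}, \{p_i, \mu \otimes \varrho_i\}) + 2m\, \mathrm{ld}\, d_B + \frac{2\, \mathrm{ld}\, e}{e} + 2 n \varepsilon\, \mathrm{ld}\, d_B.
\end{aligned} \qquad (77)$$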

Maximizing over the ensemble $\{p_i, \varrho_i\}$, dividing by $n$ and
letting $n \to \infty$, we may conclude from Eq. (77) that

$$\lim_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi(S_{n*}, \{p_i, \varrho_i\}) \le \lim_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi(S_{n*}, \{p_i, \mu \otimes \varrho_i\}) + 2 \varepsilon\, \mathrm{ld}\, d_B, \qquad (78)$$

implying that for every $\mu \in \mathcal{B}_*(\mathcal{H}_M)$ the bound on the
classical capacity $C_{EB,\mu}$ is no smaller than the bound
on the capacity $C_{AB}$. The converse estimate is immediate,
since Alice can obviously choose quantum ensembles
of the form $\{p_i, \mu \otimes \varrho_i\}$ if she has access to the input
memory. The proof for the bounds on $C_{EE,\mu}$ and $C_{AE}$
is completely analogous, as is the proof for the quantum
case. ■



B. Coding Theorems for Forgetful Channels 

In this section we will demonstrate that for forgetful
channels the entropic bounds on the classical and
quantum channel capacities presented in Prop. 11 and
Prop. 12 are in fact achievable rates, and the limits exist.

The idea of the proof is a reduction of the problem
to the memoryless setting via a relatively simple double-blocking
procedure. To illustrate the strategy, let us start
with the easy case in which there is a finite integer $m \in \mathbb{N}$
such that

$$S_m = (P \otimes \mathrm{id}^{\otimes m}) \circ S_m \qquad (79)$$

where $P: \mathcal{M} \to \mathbb{C}\, 1_{\mathcal{M}}$ is again the completely depolarizing
channel. We call channels with this property strictly
forgetful, and the smallest integer $m$ such that Eq. (79)
is satisfied will be called the memory depth of the channel
$S$. For the processing of long messages, we group
the channels into blocks of length $m + l$ and ignore the
outputs of the first $m$ channels of each block, while the
actual coding is done for the remaining $l$ channels. Eventually
we will let $l \to \infty$. When we restrict the inputs to
product states of block length $m + l$, due to strict forgetfulness
the output state factorizes, and the whole setup
corresponds to a memoryless channel on the larger input
space $\mathcal{H}_A^{\otimes (l+m)}$. For the transmission of classical information,
we can then apply the standard random coding techniques
of Holevo [51] and Schumacher and Westmoreland
[52]. Invoking subadditivity of von Neumann entropy as
in Section VI A, the rates $R$ which can be achieved with
this coding scheme are seen to be bounded as follows:

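$$\frac{1}{l+m} \max_{\{p_i, \varrho_i\}} \chi(S_{l*}, \{p_i, \varrho_i\}) - \frac{2m}{m+l}\, \mathrm{ld}\, d_B \le R \le \frac{1}{l} \max_{\{p_i, \varrho_i\}} \chi(S_{l*}, \{p_i, \varrho_i\}). \qquad (80)$$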

The claim then follows by letting $l \to \infty$. For quantum
channel capacities, Devetak's coding theorem [49] can be
shown to yield an analogous bound, in which the Holevo 
quantity is replaced by coherent information. 
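To see how the two sides of Eq. (80) close in on each other as the coding length $l$ grows at fixed memory depth $m$, one can simply evaluate the bounds. The sketch below assumes, purely for illustration, that the maximal Holevo quantity grows linearly, $\max \chi(S_{l*}, \cdot) = 0.8\, l$, with $m = 3$ and a qubit output ($d_B = 2$); none of these numbers come from the text.

```python
import math

def rate_bounds(chi_l, l, m, d_B):
    """Lower and upper bound of Eq. (80), given chi_l = max chi(S_l*, ensemble)."""
    lower = chi_l / (l + m) - 2 * m / (m + l) * math.log2(d_B)
    upper = chi_l / l
    return lower, upper

m, d_B = 3, 2              # hypothetical memory depth and output dimension
gaps = []
for l in (10, 100, 1000, 10000):
    lo, hi = rate_bounds(0.8 * l, l, m, d_B)   # hypothetical chi_l = 0.8 * l
    gaps.append(hi - lo)
    print(l, round(lo, 5), round(hi, 5))
```

The gap between the bounds shrinks roughly like $m/l$, which is why the overhead of discarding the first $m$ outputs of each block does not affect the asymptotic rate.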

It turns out that we can apply the same double- 
blocking strategy even if the memory channel S is 
merely assumed to be forgetful (and no longer strictly 
forgetful). However, in this case the output does not 
completely factorize, and the error we pick up by re- 
placing the memory channel with a memoryless channel 
on larger blocks grows with the number of blocks. 
Luckily, all memory effects can be assumed to vanish 
exponentially fast by Corollary |SJ 

While in this paper we have focused on the classical 
and quantum channel capacities proper, Devetak's proof 
of the quantum channel coding theorem [49] is based on a
coherentification scheme for the private classical channel 
capacity. The setup for private information transfer (in- 
cluding the definition of rates and capacity) is almost the 
same as for classical channel capacity, but the protocols 
have to satisfy the additional requirement that (almost) 
no information is released to the environment. 

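More formally, assume that a quantum channel
$T_*: \mathcal{B}_*(\mathcal{H}_A) \to \mathcal{B}_*(\mathcal{H}_B)$ is implemented by the Stinespring
isometry $V: \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_E$, i.e.,

$$T_*(\varrho) = \mathrm{tr}_E\, V \varrho V^* \quad \forall\, \varrho \in \mathcal{B}_*(\mathcal{H}_A) \qquad (81)$$

(cf. the section on Stinespring's representation in the Appendix
for details). By $T^E_*$ we then denote the channel that arises
from $T_*$ by interchanging the roles of $\mathcal{H}_B$ and $\mathcal{H}_E$, i.e.,

$$T^E_*(\varrho) := \mathrm{tr}_B\, V \varrho V^* \quad \forall\, \varrho \in \mathcal{B}_*(\mathcal{H}_A). \qquad (82)$$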



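This channel describes the information flow into the environment.
Privacy in Devetak's coding scheme for memoryless
channels then means that for sufficiently large
$n \in \mathbb{N}$ we may find an operator $\theta \in \mathcal{B}(\mathcal{H}_E)^{\otimes n}$ such that

$$\Big\| \frac{1}{\nu_E} \sum_{k=1}^{\nu_E} (T^E_*)^{\otimes n}(\varrho_{jk}) - \theta \Big\|_1 \le \varepsilon \quad \forall\, j = 1, \dots, \nu_B, \qquad (83)$$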



where $\{\varrho_{jk}\}_{j=1,k=1}^{\nu_B, \nu_E}$ is a set of codewords, and $\nu_B = 2^{nR}$
describes the size of the code space necessary to attain the
rate $R > 0$. We see from Eq. (83) that privacy is achieved
by randomizing over part of the codewords, leading to
smaller code spaces. Devetak could show [43] that the
capacity $C_p(T)$ of a memoryless quantum channel $T$ for
private classical information transfer is given by

$$C_p(T) = \lim_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \Big\{ \chi(T_*^{\otimes n}, \{p_i, \varrho_i\}) - \chi((T^E_*)^{\otimes n}, \{p_i, \varrho_i\}) \Big\}, \qquad (84)$$

where $\chi$ is the Holevo quantity introduced in Eq. (62).

It is a coherent version of this private classical information
protocol which yields the quantum channel coding
theorem. Note in particular that if $\varrho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$ is
a decomposition of $\varrho \in \mathcal{B}_*(\mathcal{H}_A)$ into pure states, we have

$$I_c(T, \varrho) = \chi(T, \{p_i, |\psi_i\rangle\langle\psi_i|\}) - \chi(T^E, \{p_i, |\psi_i\rangle\langle\psi_i|\}) \qquad (85)$$

by the Joint Entropy Theorem (cf. Th. 11.8 of [32]).
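The identity Eq. (85) can be illustrated numerically. The sketch below assumes that the coherent information is given by $I_c(T, \varrho) = H(T_*(\varrho)) - H(T^E_*(\varrho))$, and uses an amplitude damping qubit channel with damping parameter $0.3$ as a stand-in for $T$; both the channel and the input ensemble are hypothetical choices made only for this example.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def ptrace(rho, keep):
    """Partial trace on C^2 (x) C^2: keep=0 keeps B, keep=1 keeps E."""
    r = rho.reshape(2, 2, 2, 2)
    return np.einsum('ijkj->ik', r) if keep == 0 else np.einsum('ijik->jk', r)

g = 0.3  # illustrative damping parameter
K = [np.array([[1, 0], [0, np.sqrt(1 - g)]]), np.array([[0, np.sqrt(g)], [0, 0]])]
# Stinespring isometry V = sum_k K_k (x) |k>, mapping H_A into H_B (x) H_E
V = sum(np.kron(Kk, np.eye(2)[:, [k]]) for k, Kk in enumerate(K))

def T(rho):    # Eq. (81): trace out the environment
    return ptrace(V @ rho @ V.conj().T, keep=0)

def T_E(rho):  # Eq. (82): trace out Bob's system
    return ptrace(V @ rho @ V.conj().T, keep=1)

def holevo(probs, states, channel):
    outs = [channel(r) for r in states]
    avg = sum(p * r for p, r in zip(probs, outs))
    return entropy(avg) - sum(p * entropy(r) for p, r in zip(probs, outs))

probs = [0.3, 0.7]
psis = [np.array([1.0, 0.0]), np.array([1.0, 1.0]) / np.sqrt(2)]
pures = [np.outer(v, v.conj()) for v in psis]
rho = sum(p * r for p, r in zip(probs, pures))

coherent_info = entropy(T(rho)) - entropy(T_E(rho))
chi_diff = holevo(probs, pures, T) - holevo(probs, pures, T_E)
print(coherent_info, chi_diff)
```

The two numbers agree because for a pure input $V|\psi_i\rangle$ is a pure state on $\mathcal{H}_B \otimes \mathcal{H}_E$, so the outputs of $T$ and $T^E$ have equal entropy and the conditional terms cancel.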

As described above, part of our strategy in this Section 
will be an extension of Devetak's coherentification pro- 
tocol to forgetful quantum channels. In fact, the coher- 
entification protocol itself applies generally and does not 
depend on the internal structure of the quantum channel 
that links the sender to the receiver and the environ- 
ment. Thus, our proof of the quantum coding theorem 
amounts to showing that the privacy condition Eq. (83)
can be satisfied for forgetful quantum channels. Conse- 
quently, in the course of the proof we will also obtain a 
coding theorem for the private classical information of 
forgetful quantum channels. We thus have the following 

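Theorem 13 Let $\mathcal{H}_A$, $\mathcal{H}_B$, and $\mathcal{H}_M$ be finite-dimensional
Hilbert spaces, and let us assume that
$S_*: \mathcal{B}_*(\mathcal{H}_M) \otimes \mathcal{B}_*(\mathcal{H}_A) \to \mathcal{B}_*(\mathcal{H}_B) \otimes \mathcal{B}_*(\mathcal{H}_M)$ is a forgetful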
quantum channel. By S n * we denote its n-fold concate- 
nation. With the convention introduced in Remark^ we 



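then have

$$C_*(S) = \lim_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi(S_{n*}, \{p_i, \varrho_i\}), \qquad (86)$$

$$C^p_*(S) = \lim_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi(S_{n*}, \{p_i, \varrho_i\}) - \chi(S^E_{n*}, \{p_i, \varrho_i\}), \qquad (87)$$

$$Q_*(S) = \lim_{n \to \infty} \frac{1}{n} \max_{\varrho} I_c(S_{n*}, \varrho). \qquad (88)$$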



Proof: The proof of the upper bound on the private
classical capacity $C^p_{AB}(S)$, i.e.,

$$C^p_{AB}(S) \le \lim_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi(S_{n*}, \{p_i, \varrho_i\}) - \chi(S^E_{n*}, \{p_i, \varrho_i\}), \qquad (89)$$

is completely analogous to the one for the memoryless
case [43]. For $C_{AB}(S)$ and $Q_{AB}(S)$, corresponding results
have been presented in Props. 11 and 12. To complete
the proof it thus remains to show that

$$C_{EE,\mu}(S) \ge \lim_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi(S_{n*}, \{p_i, \varrho_i\}) \qquad (90)$$

for all $\mu \in \mathcal{B}_*(\mathcal{H}_M)$, and that the limit on the right hand
side of Eq. (90) exists, and correspondingly for $C^p_{EE,\mu}(S)$
and $Q_{EE,\mu}(S)$.

The definition of forgetfulness combined with Corol- 
lary!!] implies that we may find a sequence (S'm)meN of 
quantum channels such that 



$$\| S_m - \tilde{S}_m \|_{cb} \le c^{-m} \qquad (91)$$

for some constant $c > 1$.

As described above for the case of strictly forgetful
channels, our strategy is then to group the memory channels
into blocks of length $m + l$, to ignore the outputs on
the first $m$ channels of each block, and to replace the
resulting channel $T_{m+l} := (S_m \otimes \mathrm{id}) \circ S_l$ by the memoryless
channel

$$\tilde{T}_{m+l} := (\tilde{S}_m \otimes \mathrm{id}) \circ S_l. \qquad (92)$$



For Alice, this coding procedure means that she will have
to feed the first $m$ inputs of each block of length $m + l$
with some standard state $\omega \in \mathcal{B}_*(\mathcal{H}_A)^{\otimes m}$, while she will
use the remaining $l$ inputs of each block for the actual
coding. Bob will ignore the first $m$ output signals of
each block, and will run his decoding algorithm on the
remaining $l$ signals.

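Let us focus on the classical information capacity first,
and assume that we have a coding scheme for the memoryless
channel $\tilde{T}_{m+l}$ that achieves the rate $R \in \mathbb{R}$. By
definition of capacity, this means that for every $\varepsilon > 0$
there is an integer $N_\varepsilon \in \mathbb{N}$ such that for every $n \ge N_\varepsilon$
we may find a code book with $\nu := \lfloor 2^{nlR} \rfloor$ codewords
$\{\varrho_j\}_{j=1}^{\nu} \subset \mathcal{B}_*(\mathcal{H}_A)^{\otimes ln}$ and a corresponding observable
$\{M_j\}_{j=1}^{\nu} \subset \mathcal{B}(\mathcal{H}_B)^{\otimes ln}$ such that

$$\mathrm{tr}\, \tilde{T}^{\otimes n}_{m+l\,*}(\varrho_j)\, M_j \ge 1 - \varepsilon \qquad (93)$$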

uniformly in {Qj}j = i- By the results of Holevo j5J and 
Schumacher and Westmoreland |§2 , such coding schemes 
exist for all rates R < ^^jCi(Ti), where Ci(T)) denotes 

the product state capacity of the memoryless channel T} . 

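For the private classical information capacity, the
setting is basically the same, but the codewords
$\{\varrho_{jk}\}_{j=1,k=1}^{\nu_B, \nu_E}$ carry a second index to allow for randomization,
and there exists an operator $\theta \in \mathcal{B}(\mathcal{H}_E)^{\otimes nl}$ such
that

$$\Big\| \frac{1}{\nu_E} \sum_{k=1}^{\nu_E} (\tilde{T}^E_{m+l\,*})^{\otimes n}(\varrho_{jk}) - \theta \Big\|_1 \le \varepsilon \quad \forall\, j = 1, \dots, \nu_B \qquad (94)$$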

(cf. Eq. (83) above). Here the size of the code is given
by v B = \2 nlR \, and all rates R < j^Cf(fi) may be 
achieved. 

The same product coding scheme will now be applied
to the concatenated memory channel $T_{m+l}$. Our objectives
are to show that

(a) this coding scheme satisfies the decoding condition
Eq. (93),

(b) in the case of private information transfer, the privacy
condition Eq. (94) holds, and

(c) the attainable rates can be made arbitrarily close
to the entropic upper bounds.

This will immediately imply the coding theorem for
classical and private classical information transfer. The
quantum channel coding theorem will then follow from
the coherentification of the private classical protocol, as
explained in detail in Devetak's original work [49].

Let us start with the decoding condition (a). Assume
that in $n$ blocks of length $m + l$ each, the replacement
$\tilde{T}_{m+l} \mapsto T_{m+l}$ is made. Since $\| T_{m+l} - \tilde{T}_{m+l} \|_{cb} \le c^{-m}$
for each of these blocks by Eq. (91), the concatenated
channels satisfy

$$\| T_{n(m+l)} - \tilde{T}^{\otimes n}_{m+l} \|_{cb} \le n\, c^{-m}. \qquad (95)$$

Making use of the norm duality Eq. (52), we can conclude
from Eq. (95) that

$$\| T_{n(m+l)*}(\varrho) - \tilde{T}^{\otimes n}_{m+l\,*}(\varrho) \|_1 \le n\, c^{-m}. \qquad (96)$$

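Noting that for any two quantum states $\varrho, \sigma \in \mathcal{B}_*(\mathcal{H})$
and any observable $\{M_j\}_{j=1}^{\nu} \subset \mathcal{B}(\mathcal{H})$ the inequality

$$\sum_{j=1}^{\nu} | \mathrm{tr}\, \varrho M_j - \mathrm{tr}\, \sigma M_j | \le \| \varrho - \sigma \|_1 \qquad (97)$$

holds (cf. Th. 9.1 of [32]), we may infer from Eq. (96)
that for all codewords $\{\varrho_j\}_{j=1}^{\nu} \subset \mathcal{B}_*(\mathcal{H}_A)^{\otimes ln}$

$$\mathrm{tr}\, T_{n(m+l)*}(\varrho_j)\, M_j \ge \mathrm{tr}\, \tilde{T}^{\otimes n}_{m+l\,*}(\varrho_j)\, M_j - \| T_{n(m+l)*}(\varrho_j) - \tilde{T}^{\otimes n}_{m+l\,*}(\varrho_j) \|_1. \qquad (98)$$

For $\varepsilon > 0$, choose $n := l$, $m := \varepsilon l$ and $l$ sufficiently large
such that Eq. (93) is satisfied. We may then conclude
from Eq. (98) that

$$\mathrm{tr}\, T_{l^2(1+\varepsilon)*}(\varrho_j)\, M_j \ge 1 - 2\varepsilon \qquad (99)$$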



uniformly in j for sufficiently large I, implying that the 
product channel random coding scheme leads to asymp- 
totically vanishing errors for all rates R < C 1 (T)) and 

R<j^ Cf(fi), respectively. 
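The choice $n := l$, $m := \varepsilon l$ works because the accumulated replacement error $n\, c^{-m}$ of Eq. (95) then decays exponentially in $l$, in spite of the linear prefactor. A minimal numerical sketch, assuming the illustrative values $\varepsilon = 0.1$ and $c = 2$ (the text only requires $c > 1$) and rounding $m$ up to an integer:

```python
import math

def block_error(l, eps, c):
    """Bound n * c**(-m) from Eq. (95) with n = l blocks and m = ceil(eps * l)."""
    m = math.ceil(eps * l)   # rounding up is an assumption made for this sketch
    return l * c ** (-m)

eps, c = 0.1, 2.0            # hypothetical values
errors = [block_error(l, eps, c) for l in (10, 100, 1000)]
print(errors)
```

For small $l$ the bound is useless (it exceeds 1), but it drops below any fixed $\varepsilon$ once $l$ is large enough, which is all that the step from Eq. (98) to Eq. (99) requires.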

We will now show that (b) also holds, with the same
substitution $\varepsilon \mapsto 2\varepsilon$. To this end, we note that Devetak's
randomization scheme can be slightly modified to include
the output memory state of each block. By this trick we
may guarantee that in an $n$-fold concatenation of blocks of
length $m + l$ each, even the intermediate blocks, for which
no coding is done and the respective outputs are ignored,
are (almost) uncorrelated with Alice's signal states.

Making again use of the error estimate for concatenated
channels and the norm duality Eq. (52), we may
then conclude from Eq. (94) that



VE 



< 



k=l 



1 VE r 



(100) 



for sufficiently large $l$, as advertised. Note that without
the additional randomization over the output memory, 
the average mutual information ^H(A : E) between the 
signal states and Eve's output states will still be small. 
This is due to the fact that in the above coding scheme 
the intermediate blocks only constitute a fraction e of the 
total length. However, this is in general not sufficient to 
conclude that a norm estimate such as Eq. (100) holds.

In order to conclude the proof, it only remains to
show that $C_1(\tilde{T}_l)$ can be bounded from below in terms
of $\max_{\{p_i, \varrho_i\}} \chi(S_{l*}, \{p_i, \varrho_i\})$ for large $l$, and similarly for
the private classical and quantum capacities.

Applying subadditivity of von Neumann entropy and
Fannes' Inequality Eq. (75), we see that

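$$\begin{aligned}
\chi(S_{l*}, \{p_i, \varrho_i\}) &\le \chi(T_{l+\varepsilon l\,*}, \{p_i, \varrho_i\}) + 2 \varepsilon l\, \mathrm{ld}\, d_B \\
&\le \chi(\tilde{T}_{l+\varepsilon l\,*}, \{p_i, \varrho_i\}) + 2 \varepsilon l\, \mathrm{ld}\, d_B + \frac{2\, \mathrm{ld}\, e}{e} + 2 l (1+\varepsilon) \varepsilon\, \mathrm{ld}\, d_B \\
&\le l (1+\varepsilon)\, C_1(\tilde{T}_l) + 2 \varepsilon l\, \mathrm{ld}\, d_B + \frac{2\, \mathrm{ld}\, e}{e} + 2 l (1+\varepsilon) \varepsilon\, \mathrm{ld}\, d_B.
\end{aligned} \qquad (101)$$

Since $C_1(\tilde{T}_l)$ has been shown to be an achievable rate for
large enough $l$, we may conclude from Eq. (101) that

$$C_{EE,\mu}(S) \ge \frac{1}{1+\varepsilon} \Big( \lim_{l \to \infty} \frac{1}{l} \max_{\{p_i, \varrho_i\}} \chi(S_{l*}, \{p_i, \varrho_i\}) - 4 \varepsilon\, \mathrm{ld}\, d_B - 2 \varepsilon^2\, \mathrm{ld}\, d_B \Big). \qquad (102)$$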



Since $\varepsilon > 0$ is arbitrary, Eq. (102) together with the upper
bound in Prop. 12 entails that

$$C_{EE,\mu}(S) = \lim_{n \to \infty} \frac{1}{n} \max_{\{p_i, \varrho_i\}} \chi(S_{n*}, \{p_i, \varrho_i\}). \qquad (103)$$

The coding scheme described above uses blocks of length
$n_l := l^2 (1 + \varepsilon)$. This is a subexponential sequence
in the sense of Remark^] and we may thus apply the 
One-Sequence Theorem l] to conclude that the limit in 
Eq. I|103|) exists, implying that Eq. holds. The rate 
estimate for the private classical and quantum capacities 
is completely analogous. ■



VII. SUMMARY AND OUTLOOK 

We have presented a general model for quantum chan- 
nels with memory, and shown that under mild causality 
constraints every quantum process can be thought of as 
a concatenated memory channel (plus some memory ini- 
tializer). 

For these memory channels, channel capacities have 
been introduced along the lines familiar from the mem- 
oryless context, and it has been demonstrated that dif- 
ferent operational setups may lead to different values of 
the channel capacity. 

While we have concentrated on the classical and quan- 
tum channel capacities proper, it is evident that the the- 
ory may be extended to memory channels assisted by 
additional resources, such as entanglement and classical 
side communication. As seen in Section VI A, entropic
bounds typically depend only on the amount of informa- 
tion shared by sender and receiver, and not on the inter- 
nal structure of the quantum channel linking these two. 
Coding theorems for memoryless channels can easily be 
extended to forgetful memory channels, as demonstrated 
in Section VI B. They typically lead to regularized expressions
for the channel capacity, which still require the
solution of optimization problems in Hilbert spaces of 
exponentially growing dimensionality. In general, com- 
puting capacities of quantum memory channels is thus 
at least as challenging as for memoryless channels, with 
less hope for improvements. 

A general study of the resulting capacity landscape 
is still pending. In particular, we do not yet know un- 
der which general conditions some (or all) of the channel 
capacities introduced in Def. Q] coincide. It may seem 
reasonable to conjecture that, as long as the memory 
system is finite-dimensional, it is irrelevant for capacity 
purposes whether Bob or Eve control the final memory 
output. While this is almost immediate for the entropic 



upper bounds on the channel capacities (cf. Prop. 11
and Prop. 12), so far we have not been able to verify this
conjecture for the capacities themselves. 

We have demonstrated in Section V that generic memory
channels are forgetful, and in Section VI B we have
presented coding theorems for this very important class
of channels. It may thus seem possible to always
restrict one's attention to forgetful channels. However,
the capacity of a memoryless channel is sometimes
discontinuous in its parameters. So while it is always 
possible to approximate a given non-forgetful channel by 
a forgetful channel to arbitrary degree of accuracy, their 
capacities may be very different, as the example given in 
Section V demonstrates. This calls for a more detailed
analysis of non-forgetful quantum channels and their ca- 
pacities. 

While we have presented several equivalent criteria for 
a memory channel to be forgetful (cf. Section V), we
do not yet have a Structure Theorem to characterize all 
the non- forgetful quantum channels, nor do we have a 
simple test to decide whether a given memory channel is 
forgetful. 

Apart from some relatively simple model channels, lit- 
tle is known so far about the channel capacity of general 
non-forgetful memory channels. The derivation of coding 
theorems in this case is likely to require universal cod- 
ing schemes, with encoders and decoders independent of 
Eve's choice of the initial memory state. For the memory 
channel with a global classical switch (cf. Section III C),
universal coding schemes do exist. However, this is
a rather special example of a memory channel, and the 
general case remains very much open.

APPENDIX

In this section we provide some mathematical back- 
ground on the description of infinite-dimensional quan- 
tum systems by quasi-local algebras, and on quantum 
channels between such algebras. We start with a quick 
summary of C*-algebra terminology, and then concen- 
trate on those aspects which are essential to the proof of 
the Structure Theorem in Section IV. For an in-depth



treatment we refer to the texts of Bratteli and Robinson 
[H, Ruelle [H, and Paulsen H3- 



1. C*-Algebras 

The operations making up the abstract structure of 
C*-algebras are inspired by those known from algebras 
of bounded operators B(H) on a Hilbert space H. In 
fact, every such operator algebra is a C*-algebra, and 
conversely every abstract C*-algebra is isomorphic to a 
norm-closed self-adjoint algebra of bounded operators 
on a Hilbert space. More details on this fundamental 
structure theorem for C*-algebras will be provided in 
Section 4 below.

A C*-algebra $\mathcal{A}$ is a vector space on the complex numbers
$\mathbb{C}$ which is equipped with a product $a \times b \mapsto ab$ for
$a, b \in \mathcal{A}$. The product is assumed to be distributive and
associative, but not necessarily commutative. In addition,
$\mathcal{A}$ has an adjoint operation (also called star operation
or involution) $\mathcal{A} \ni a \mapsto a^* \in \mathcal{A}$. This is conjugate
linear (or anti-linear), i.e., $(\alpha a + \beta b)^* = \bar{\alpha} a^* + \bar{\beta} b^*$ for
all $a, b \in \mathcal{A}$ and $\alpha, \beta \in \mathbb{C}$, and has the properties $a^{**} = a$
and $(ab)^* = b^* a^*$. Physicists often write $a^+$ or $a^\dagger$ instead
of $a^*$.

Besides, there is a norm $\| \cdot \|_\infty$ on $\mathcal{A}$ which associates
a non-negative number $\|a\|_\infty$ to every $a \in \mathcal{A}$ such that
$\|a\|_\infty = 0$ implies $a = 0$. With respect to the algebraic
properties of $\mathcal{A}$, the norm satisfies $\|\alpha a\|_\infty = |\alpha| \, \|a\|_\infty$,
the triangle inequality $\|a + b\|_\infty \le \|a\|_\infty + \|b\|_\infty$ and the
product inequality $\|ab\|_\infty \le \|a\|_\infty \|b\|_\infty$ for all $a, b \in \mathcal{A}$
and $\alpha \in \mathbb{C}$. In addition, we have $\|a^* a\|_\infty = \|a\|_\infty^2$.
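For the concrete C*-algebra of complex $3 \times 3$ matrices with the operator norm, the axioms listed above can be verified directly. The following is a minimal sketch on randomly drawn matrices; the dimension, the random seed and the scalar `alpha` are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def rand(d=3):
    """Random complex d x d matrix, standing in for an element of B(H)."""
    return rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

def op_norm(x):
    """Operator norm: the largest singular value."""
    return np.linalg.norm(x, 2)

a, b, alpha = rand(), rand(), 0.7 - 1.2j

checks = {
    "homogeneity": np.isclose(op_norm(alpha * a), abs(alpha) * op_norm(a)),
    "triangle": op_norm(a + b) <= op_norm(a) + op_norm(b) + 1e-12,
    "product": op_norm(a @ b) <= op_norm(a) * op_norm(b) + 1e-12,
    "c_star_identity": np.isclose(op_norm(a.conj().T @ a), op_norm(a) ** 2),
    "involution": np.allclose((a @ b).conj().T, b.conj().T @ a.conj().T),
}
print(checks)
```

The C*-identity is the property that singles out the operator norm here: for instance, the Frobenius norm satisfies the other norm axioms but in general not $\|a^* a\| = \|a\|^2$.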

An identity $1_{\mathcal{A}}$ of a C*-algebra $\mathcal{A}$ is an element of $\mathcal{A}$
such that $1_{\mathcal{A}}\, a = a = a\, 1_{\mathcal{A}}$ for all $a \in \mathcal{A}$. A C*-algebra
can have at most one identity. However, not all algebras
come equipped with an identity. The absence of an
identity can complicate the structural analysis, but
these complications can be avoided by embedding $\mathcal{A}$ in
a larger algebra $\tilde{\mathcal{A}}$ which has an identity. Here we will
always assume that $\mathcal{A}$ possesses an identity. Unless the
algebra is identically zero, we then have $\|1_{\mathcal{A}}\|_\infty = 1$.

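A state on the C*-algebra $\mathcal{A}$ is a linear functional
$\omega: \mathcal{A} \to \mathbb{C}$ which is positive in the sense that $\omega(a^* a) \ge 0$
for all $a \in \mathcal{A}$ and normalized such that $\omega(1_{\mathcal{A}}) = 1$. If
$\mathcal{A} = \mathcal{B}(\mathcal{H}_A)$ for some finite-dimensional Hilbert space
$\mathcal{H}_A$, to every state $\omega$ there exists a unique density operator
$\varrho_\omega \in \mathcal{B}_*(\mathcal{H}_A)$ such that

$$\omega(a) = \mathrm{tr}(\varrho_\omega\, a) \quad \forall\, a \in \mathcal{A}. \qquad (A.1)$$

For infinite-dimensional systems, there may be states
which cannot be represented as density operators in the
sense of Eq. (A.1).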

The commutant $\mathcal{A}'$ of a C*-algebra $\mathcal{A}$ is the set of all
operators $a \in \mathcal{A}$ that commute with $\mathcal{A}$, i.e.,

$$\mathcal{A}' := \{ a \in \mathcal{A} \mid ab = ba \ \forall\, b \in \mathcal{A} \}. \qquad (A.2)$$

$\mathcal{A}'$ is a sub-algebra of $\mathcal{A}$. If $\mathcal{A}' = \mathcal{A}$, all operators in
$\mathcal{A}$ commute, and the algebra is called Abelian. These
algebras describe classical systems.

2. Quasi-Local Algebras 

Quasi-local algebras are adapted to the description of 
infinitely extended quantum lattice systems. The frame- 
work discussed in this Section works for any lattice struc- 
ture in any spatial dimension. In fact, it does not even 
require translational invariance and can be formulated 
for possibly different quantum (or classical) systems lo- 
calized on the nodes of a finite or infinite graph. How- 
ever, our interest is in the input and output signals of a 
causal automaton, and we may thus restrict our discus- 
sion to the simple case in which the lattice consists of 
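a one-dimensional spin chain labelled by integers $z \in \mathbb{Z}$.
To each site $z \in \mathbb{Z}$ we assign an isomorphic copy $\mathcal{A}_z$ of
the observable algebra $\mathcal{A}$, which in our case is a finite-dimensional
C*-algebra $\mathcal{B}(\mathcal{H}_A)$ or $\mathcal{B}(\mathcal{H}_B)$ of Alice's input
and Bob's output system, respectively. When $\Lambda \subset \mathbb{Z}$ is
a finite subset, we denote by $\mathcal{A}_\Lambda := \bigotimes_{z \in \Lambda} \mathcal{A}_z$ the algebra
of observables belonging to all sites in $\Lambda$. Whenever
$\Lambda_1 \subset \Lambda_2$, tensoring with the identity operator $1_{\mathcal{A}}$
on $\Lambda_2 \setminus \Lambda_1$ will make $\mathcal{A}_{\Lambda_1}$ a sub-algebra of $\mathcal{A}_{\Lambda_2}$. In the
same way the product $a_1 a_2$ of operators $a_i \in \mathcal{A}_{\Lambda_i}$ becomes
a well-defined element of $\mathcal{A}_{\Lambda_1 \cup \Lambda_2}$. Since tensoring
with the identity $1_{\mathcal{A}}$ does not change the norm, this construction
yields a normed algebra of local observables. Its
norm-completion is called quasi-local algebra, and will be
denoted by

$$\mathcal{A}_{\mathbb{Z}} := \overline{\bigcup_{\Lambda \subset \mathbb{Z}} \mathcal{A}_\Lambda}. \qquad (A.3)$$

Similarly, for infinite subsystems $\Lambda \subset \mathbb{Z}$ we define $\mathcal{A}_\Lambda$ as
the closure of the union of all $\mathcal{A}_{\Lambda'}$ for finite $\Lambda' \subset \Lambda$. In
particular, by $\mathcal{A}_- := \mathcal{A}_{(-\infty, 0]}$ and $\mathcal{A}_+ := \mathcal{A}_{[1, \infty)}$ we will
denote the left and right half chain, respectively.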

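The algebra $\mathcal{A}_\Lambda$ is interpreted as the algebra of physical
observables for a subsystem localized in the region $\Lambda \subset \mathbb{Z}$.
The quasi-local algebra then corresponds to the extended
algebra of observables on the infinite spin chain $\mathbb{Z}$.

On the spin chain we introduce a shift operator $\sigma$ by
setting

$$\sigma: \mathcal{A}_\Lambda \to \mathcal{A}_{\Lambda + 1}, \quad a \simeq a \otimes 1_{\mathcal{A}} \mapsto \sigma(a) := 1_{\mathcal{A}} \otimes a \simeq a, \qquad (A.4)$$

where we have used the notation $\Lambda + 1 := \{ z + 1 \mid z \in \Lambda \}$.
The canonical extension of $\sigma$ onto the quasi-local algebra
$\mathcal{A}_{\mathbb{Z}}$ is a *-automorphism on $\mathcal{A}_{\mathbb{Z}}$, and the integer powers
$\{\sigma^z\}_{z \in \mathbb{Z}}$ represent an action of the translation group $\mathbb{Z}$
by automorphisms on $\mathcal{A}_{\mathbb{Z}}$.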



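As explained in Section 1, a state $\omega$ on the spin chain is
a positive and normalized linear functional on $\mathcal{A}_{\mathbb{Z}}$. Equivalently,
a state $\omega$ is given by a family $\{\omega_\Lambda\}_{\Lambda \subset \mathbb{Z}}$ of density
operators on $\mathcal{A}_\Lambda$ for finite $\Lambda \subset \mathbb{Z}$ such that $\omega(a) = \mathrm{tr}(\omega_\Lambda a)$
for $a \in \mathcal{A}_\Lambda$. The local density matrices have to
satisfy the consistency condition that $\mathrm{tr}_{\Lambda_2 \setminus \Lambda_1} \omega_{\Lambda_2} = \omega_{\Lambda_1}$
whenever $\Lambda_1 \subset \Lambda_2$. This equivalence reflects the fact
that the state of the entire spin chain is assumed to be
determined by the expectation values of all observables
on finite subsystems $\Lambda \subset \mathbb{Z}$.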
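The consistency condition on the local density matrices can be made concrete for a short piece of a qubit chain. The sketch below assumes three sites with a randomly drawn joint state; `trace_out_last` and the choice of the Pauli-X observable on the first site are illustrative.

```python
import numpy as np

def random_state(d, rng):
    """Random density matrix on C^d."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def trace_out_last(rho, d_keep, d_last):
    """Partial trace over the last tensor factor of C^d_keep (x) C^d_last."""
    r = rho.reshape(d_keep, d_last, d_keep, d_last)
    return np.einsum('ijkj->ik', r)

rng = np.random.default_rng(2)
w123 = random_state(8, rng)          # local state on sites {1, 2, 3}
w12 = trace_out_last(w123, 4, 2)     # restriction to {1, 2}
w1 = trace_out_last(w12, 2, 2)       # restriction to {1}

a = np.array([[0, 1], [1, 0]], dtype=complex)   # observable on site 1
a_embedded = np.kron(a, np.eye(4))              # a (x) 1 (x) 1 on {1, 2, 3}

local_value = np.trace(w1 @ a)
global_value = np.trace(w123 @ a_embedded)
print(local_value, global_value)
```

Both expectation values coincide, which is exactly what allows an observable on a small region to be identified with its embedding into any larger region.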

3. Stinespring's Representation 

Quantum channels, as introduced in Section II A, are
completely positive and unital maps $S: \mathcal{B} \to \mathcal{A}$ between
observable algebras $\mathcal{B}$ and $\mathcal{A}$ attributed to physical
systems. In Heisenberg picture language, they describe 
how observables (and thus expectation values) transform 
when the system under consideration undergoes a free 
or controlled evolution. 

By Stinespring's famous representation theorem [55],
for every completely positive (not necessarily unital) map
$S: \mathcal{B} \to \mathcal{B}(\mathcal{H}_A)$ we may find a Hilbert space $\mathcal{K}$ and an
isometry $V: \mathcal{H}_A \to \mathcal{K}$ such that

$$S(b) = V^* \pi(b) V \quad \forall\, b \in \mathcal{B}, \qquad (A.5)$$

where $\pi: \mathcal{B} \to \mathcal{B}(\mathcal{K})$ is a *-representation, i.e., a linear
operator that preserves the algebraic structure in that
$\pi(b_1 b_2) = \pi(b_1) \pi(b_2)$ and $\pi(b^*) = \pi(b)^*$.

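If the output system $\mathcal{B}$ is finite-dimensional, the representation
Eq. (A.5) takes the simpler form

$$S(b) = V^* (b \otimes 1_{\mathcal{K}}) V \quad \forall\, b \in \mathcal{B} \qquad (A.6)$$

with the Stinespring isometry $V: \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{K}$, where
$\mathcal{B} = \mathcal{B}(\mathcal{H}_B)$ with $\dim \mathcal{H}_B < \infty$. By means of the duality
Eq. (2), in Schrödinger picture this form of Stinespring's
Theorem gives rise to the ancilla representation of the
quantum channel $S_*$,

$$S_*(\varrho) = \mathrm{tr}_{\mathcal{K}}\, V (\varrho \otimes \varrho_0) V^* \quad \forall\, \varrho \in \mathcal{B}_*(\mathcal{H}_A), \qquad (A.7)$$

where $\varrho_0 \in \mathcal{B}(\mathcal{K})$ is a so-called ancilla state. The Kraus
representation follows from Eq. (A.6) by introducing
a basis in $\mathcal{K}$.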
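The relation between the Stinespring form Eq. (A.6) and a Kraus decomposition can be checked explicitly for a small example. The sketch below assumes an amplitude damping qubit channel (damping parameter $0.3$, an illustrative choice) and builds the isometry as $V = \sum_k K_k \otimes |k\rangle$; in the Schrödinger picture it uses the form $S_*(\varrho) = \mathrm{tr}_{\mathcal{K}}\, V \varrho V^*$ of Eq. (81) rather than the ancilla form Eq. (A.7).

```python
import numpy as np

g = 0.3  # illustrative damping parameter
K = [np.array([[1, 0], [0, np.sqrt(1 - g)]]), np.array([[0, np.sqrt(g)], [0, 0]])]
# Isometry V: H_A -> H_B (x) K, obtained by fixing the basis {|k>} in K
V = sum(np.kron(Kk, np.eye(2)[:, [k]]) for k, Kk in enumerate(K))

def S(b):
    """Heisenberg picture, Eq. (A.6): S(b) = V* (b (x) 1_K) V."""
    return V.conj().T @ np.kron(b, np.eye(2)) @ V

def S_kraus(b):
    """Kraus form in the Heisenberg picture: sum_k K_k* b K_k."""
    return sum(Kk.conj().T @ b @ Kk for Kk in K)

def S_star(rho):
    """Schroedinger picture: trace out K after applying the isometry."""
    r = (V @ rho @ V.conj().T).reshape(2, 2, 2, 2)
    return np.einsum('ijkj->ik', r)

b = np.array([[0.2, 1j], [-1j, -0.5]])      # test observable
rho = np.array([[0.6, 0.2], [0.2, 0.4]])    # test state

same_form = np.allclose(S(b), S_kraus(b))
unital = np.allclose(S(np.eye(2)), np.eye(2))
dual = np.isclose(np.trace(S_star(rho) @ b), np.trace(rho @ S(b)))
print(same_form, unital, dual)
```

The last check is the duality between the two pictures: expectation values computed with the evolved state and the original observable agree with those computed with the original state and the evolved observable.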

A triple $(\mathcal{K}, \pi, V)$ as obtained in Stinespring's Theorem
Eq. (A.5) is usually called a Stinespring representation
for the channel $S$. If the closed linear span of
$\pi(\mathcal{B}) V \mathcal{H}_A$ equals $\mathcal{K}$, the representation is called minimal.
Minimal Stinespring representations are unique up
to unitary equivalence, in the following sense: Assume
that the quantum channel $S$ has a minimal Stinespring
representation Eq. (A.5) as well as a further (not necessarily
minimal) one

$$S(b) = V_1^* \pi_1(b) V_1 \quad \forall\, b \in \mathcal{B} \qquad (A.8)$$

with another Stinespring isometry $V_1: \mathcal{H}_A \to \mathcal{K}_1$. Since
the representation Eq. (A.5) is assumed to be minimal,
we conclude that $\dim \mathcal{K} \le \dim \mathcal{K}_1$, and the prescription

$$W (\pi(b) V \psi) := \pi_1(b) V_1 \psi \qquad (A.9)$$

for $b \in \mathcal{B}$ and $\psi \in \mathcal{H}_A$ yields a well-defined isometry
$W: \mathcal{K} \to \mathcal{K}_1$. From the definition of $W$ we find that the
intertwining relation $W \pi = \pi_1 W$ holds, implying that
$W \pi(b) V = \pi_1(b) V_1$ for all $b \in \mathcal{B}$, and thus $W V = V_1$ by
setting $b = 1_{\mathcal{B}}$. The uniqueness statement plays a central
role in the Structure Theorem for quantum memory
channels (cf. Section IV).

4. GNS-Representation of Quantum States 

A state $\omega: \mathcal{B} \to \mathbb{C}$, as defined in Section 1 above, is a
unital and positive linear map. Since the range algebra
C is Abelian, it is even completely positive (cf. pfjj . 
Th. 3.9), and thus we may apply Stinespring's Theorem
to conclude that $\omega$ can be given the representation

$$\omega(b) := \langle \Omega | \pi(b) | \Omega \rangle \quad \forall\, b \in \mathcal{B}, \qquad (A.10)$$

where $\Omega := V(1)$. Eq. (A.10) is usually called the
GNS-representation of quantum states, after Gelfand 
and Naimark j^, and Segal |57j . 
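In finite dimensions the representation Eq. (A.10) can be written down explicitly: for a state given by a density operator, a purification plays the role of the vector $\Omega$, with $\pi(b) = b \otimes 1$. The sketch below assumes a hypothetical full-rank qutrit state and is meant only to illustrate the structure, not the general construction.

```python
import numpy as np

rho = np.diag([0.5, 0.3, 0.2]).astype(complex)   # hypothetical density operator
weights, vecs = np.linalg.eigh(rho)

# Omega = sum_i sqrt(w_i) |e_i> (x) |conj(e_i)>, a purification of rho
Omega = sum(np.sqrt(w) * np.kron(vecs[:, i], vecs[:, i].conj())
            for i, w in enumerate(weights))

def pi_rep(b):
    """Representation pi(b) = b (x) 1 on the doubled Hilbert space."""
    return np.kron(b, np.eye(3))

rng = np.random.default_rng(3)
b = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

omega_b = np.trace(rho @ b)                    # omega(b) = tr(rho b)
gns_b = Omega.conj() @ pi_rep(b) @ Omega       # <Omega| pi(b) |Omega>
print(omega_b, gns_b)
```

The map `pi_rep` is multiplicative and respects adjoints, so it is a *-representation in the sense used above, and the vector `Omega` has unit norm because the state is normalized.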

The GNS Theorem can be applied to prove the 
basic structure theorem of C*-algebras: 

Theorem 14 Every C* -algebra A is isomorphic to a 
norm-closed self-adjoint algebra of bounded operators on 
a Hilbert space. 

The idea of the proof is to construct for each state $\omega$ of
$\mathcal{A}$ the corresponding GNS representation $(\mathcal{K}_\omega, \pi_\omega, \Omega_\omega)$,
and then to form the so-called universal representation
by setting

$$\mathcal{K} := \bigoplus_\omega \mathcal{K}_\omega \quad \text{and} \quad \pi := \bigoplus_\omega \pi_\omega. \qquad (A.11)$$

The existence of sufficiently many states is guaranteed
by the Hahn-Banach extension theorem. The details are
spelled out in Section 2.3 of [39].